arXiv daily: Optimization and Control (math.OC)
1.Chemotherapy planning and multi-appointment scheduling: formulations, heuristics and bounds
Authors:Giuliana Carello, Mauro Passacantando, Elena Tanfani
Abstract: The number of new cancer cases is expected to increase by about 50% in the next 20 years, and the need for chemotherapy treatments will increase accordingly. Chemotherapy treatments are usually performed in outpatient cancer centers where patients affected by different types of tumors are treated. The treatment delivery must be carefully planned to optimize the use of limited resources, such as drugs, medical and nursing staff, consultation and exam rooms, and chairs and beds for the drug infusion. Planning and scheduling chemotherapy treatments involve different problems at different decision levels. In this work, we focus on the patient chemotherapy multi-appointment planning and scheduling problem at an operational level, namely the problem of determining the day and starting time of the oncologist visit and drug infusion for a set of patients to be scheduled along a short-term planning horizon. We use a per-pathology paradigm, where the days of the week in which patients can be treated, depending on their pathology, are known. We consider different metrics and formulate the problem as a multi-objective optimization problem tackled by sequentially solving three problems in a lexicographic multi-objective fashion. The ultimate aim is to minimize the patient's discomfort. The problems turn out to be computationally challenging, thus we propose bounds and ad-hoc approaches, exploiting alternative problem formulations, decomposition, and $k$-opt search. The approaches are tested on real data from an Italian outpatient cancer center and outperform state-of-the-art solvers.
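The lexicographic scheme mentioned in the abstract above can be illustrated on a toy example. This is only a sketch over a finite list of candidates: the three metrics (waiting time, overtime, idle chair time), the candidate tuples, and the function name `lexicographic_argmin` are hypothetical and are not taken from the paper, which solves three optimization problems rather than filtering an enumerated list.

```python
def lexicographic_argmin(candidates, objectives, tol=1e-9):
    """Filter candidates by each objective in priority order.

    After objective i, only candidates that are optimal (within tol)
    for objectives 1..i survive to be compared on objective i+1.
    """
    survivors = list(candidates)
    for objective in objectives:
        best = min(objective(c) for c in survivors)
        survivors = [c for c in survivors if objective(c) <= best + tol]
    return survivors


# Hypothetical schedules: (total waiting time, overtime, idle chair time)
schedules = [(30, 5, 10), (30, 2, 40), (45, 0, 0), (30, 2, 25)]
objectives = [lambda s: s[0], lambda s: s[1], lambda s: s[2]]

best = lexicographic_argmin(schedules, objectives)
print(best)  # [(30, 2, 25)]
```

Note that (45, 0, 0) is discarded at the first stage even though it is best on the two lower-priority metrics, which is the defining behaviour of a lexicographic ordering.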
2.Bilinear control of semilinear elliptic PDEs: Convergence of a semismooth Newton method
Authors:Eduardo Casas, Konstantinos Chrysafinos, Mariano Mateos
Abstract: In this paper, we carry out the analysis of the semismooth Newton method for bilinear control problems related to semilinear elliptic PDEs. We prove existence, uniqueness and regularity for the solution of the state equation, as well as differentiability properties of the control to state mapping. Then, first and second order optimality conditions are obtained. Finally, we prove the superlinear convergence of the semismooth Newton method to local solutions satisfying no-gap second order sufficient optimality conditions as well as a strict complementarity condition.
3.Online Mixed Discrete and Continuous Optimization: Algorithms, Regret Analysis and Applications
Authors:Lintao Ye, Ming Chi, Zhi-Wei Liu, Xiaoling Wang, Vijay Gupta
Abstract: We study an online mixed discrete and continuous optimization problem where a decision maker interacts with an unknown environment for a number of $T$ rounds. At each round, the decision maker needs to first jointly choose a discrete and a continuous action and then receives a reward associated with the chosen actions. The goal for the decision maker is to maximize the cumulative reward after $T$ rounds. We propose algorithms to solve the online mixed discrete and continuous optimization problem and prove that the algorithms yield sublinear regret in $T$. We show that a wide range of applications in practice fit into the framework of the online mixed discrete and continuous optimization problem, and apply the proposed algorithms to solve these applications with regret guarantees. We validate our theoretical results with numerical experiments.
4.Tulipa Energy Model: Mathematical Formulation
Authors:Diego A. Tejada-Arango, Germán Morales-España, Lauren Clisby, Ni Wang, Abel S. Siqueira, Ali Subayu, Laurent Soucasse, Zhi Gao
Abstract: Tulipa Energy Model aims to optimise the investment and operation of the electricity market, considering its coupling with other sectors, such as hydrogen and heat, that can also be electrified. The problem is analysed from the perspective of a central planner who determines the expansion plan that is most beneficial for the system as a whole, either by maximising social welfare or by minimising total costs. The formulation provides a general description of the objective function and constraints in the optimisation model based on the concept of energy assets representing any element in the model. The model uses subsets and specific methods to determine the constraints that apply to a particular technology or network, allowing more flexibility in the code to consider new technologies and constraints with different levels of detail in the future.
5.Optimal inexactness schedules for Tunable Oracle based Methods
Authors:Guillaume Van Dessel, François Glineur
Abstract: Several recent works address the impact of inexact oracles in the convergence analysis of modern first-order optimization techniques, e.g. Bregman Proximal Gradient and Prox-Linear methods as well as their accelerated variants, extending their field of applicability. In this paper, we consider situations where the oracle's inexactness can be chosen on demand, with more precision coming at a higher computational price. Our main motivations arise from oracles requiring the solving of auxiliary subproblems or the inexact computation of involved quantities, e.g. a mini-batch stochastic gradient as a full-gradient estimate. We propose optimal inexactness schedules according to presumed oracle cost models and patterns of worst-case guarantees, covering among others convergence results of the aforementioned methods in the presence of inexactness. Specifically, we detail how to choose the level of inexactness at each iteration to obtain the best trade-off between convergence and computational investments. Furthermore, we highlight the benefits one can expect by tuning those oracles' quality instead of keeping it constant throughout. Finally, we provide extensive numerical experiments that support the practical interest of our approach, both in offline and online settings, applied to the Fast Gradient algorithm.
6.Learning to Warm-Start Fixed-Point Optimization Algorithms
Authors:Rajiv Sambharya, Georgina Hall, Brandon Amos, Bartolomeo Stellato
Abstract: We introduce a machine-learning framework to warm-start fixed-point optimization algorithms. Our architecture consists of a neural network mapping problem parameters to warm starts, followed by a predefined number of fixed-point iterations. We propose two loss functions designed to either minimize the fixed-point residual or the distance to a ground truth solution. In this way, the neural network predicts warm starts with the end-to-end goal of minimizing the downstream loss. An important feature of our architecture is its flexibility, in that it can predict a warm start for fixed-point algorithms run for any number of steps, without being limited to the number of steps it has been trained on. We provide PAC-Bayes generalization bounds on unseen data for common classes of fixed-point operators: contractive, linearly convergent, and averaged. Applying this framework to well-known applications in control, statistics, and signal processing, we observe a significant reduction in the number of iterations and solution time required to solve these problems, through learned warm starts.
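To make the warm-start idea in the abstract above concrete, the following sketch runs a fixed number of fixed-point iterations from two starting points and compares the fixed-point residual. Everything here is an assumption for illustration: the scalar contractive operator `T`, the step count, and the hand-picked "warm" guess stand in for the paper's neural network prediction, which is not reproduced.

```python
def fixed_point_residual(T, z0, steps):
    """Run `steps` iterations z <- T(z) from z0 and return |T(z) - z|."""
    z = z0
    for _ in range(steps):
        z = T(z)
    return abs(T(z) - z)


def T(z):
    # Hypothetical contractive operator (factor 0.5) with fixed point z* = 2.
    return 0.5 * z + 1.0


cold = fixed_point_residual(T, z0=0.0, steps=5)  # uninformed start
warm = fixed_point_residual(T, z0=1.9, steps=5)  # stand-in for a learned warm start

print(cold, warm)
```

For a contractive operator the residual after a fixed number of steps shrinks in proportion to the initial distance to the fixed point, so a better starting guess translates directly into a smaller residual for the same iteration budget.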
7.Mean-field games of speedy information access with observation costs
Authors:Dirk Becherer, Christoph Reisinger, Jonathan Tam
Abstract: We investigate a mean-field game (MFG) in which agents can exercise control actions that affect their speed of access to information. The agents can dynamically decide to receive observations with less delay by paying higher observation costs. Agents seek to exploit their active information gathering by making further decisions to influence their state dynamics to maximize rewards. In the mean field equilibrium, each generic agent solves individually a partially observed Markov decision problem in which the way partial observations are obtained is itself also subject to dynamic control actions by the agent. Based on a finite characterisation of the agents' belief states, we show how the mean field game with controlled costly information access can be formulated as an equivalent standard mean field game on a suitably augmented but finite state space. We prove that with sufficient entropy regularisation, a fixed point iteration converges to the unique MFG equilibrium and yields an approximate $\epsilon$-Nash equilibrium for a large but finite population size. We illustrate our MFG by an example from epidemiology, where medical testing results at different speeds and costs can be chosen by the agents.
8.Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule
Authors:Jason M. Altschuler, Pablo A. Parrilo
Abstract: Can we accelerate convergence of gradient descent without changing the algorithm -- just by carefully choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in $k^{\log_{\rho} 2} \approx k^{0.7864}$ iterations, where $\rho=1+\sqrt{2}$ is the silver ratio and $k$ is the condition number. This is intermediate between the textbook unaccelerated rate $k$ and the accelerated rate $\sqrt{k}$ due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous accelerated rate $\varepsilon^{-\log_{\rho} 2} \approx \varepsilon^{-0.7864}$. We conjecture and provide partial evidence that these rates are optimal among all possible stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is non-monotonic, fractal-like, and approximately periodic of period $k^{\log_{\rho} 2}$. This leads to a phase transition in the convergence rate: initially super-exponential (acceleration regime), then exponential (saturation regime).
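The exponent quoted in the abstract above can be checked numerically with a change of base. The snippet below only evaluates $\log_{\rho} 2$ and compares the three rates for one hypothetical condition number (`kappa = 1e4` is an arbitrary choice, not a value from the paper); it does not construct the Silver Stepsize Schedule itself.

```python
import math

rho = 1 + math.sqrt(2)                   # silver ratio
exponent = math.log(2) / math.log(rho)   # log_rho(2) via change of base
print(round(exponent, 4))                # 0.7864

kappa = 1e4  # hypothetical condition number, chosen only for illustration
accelerated = math.sqrt(kappa)           # Nesterov-type rate
silver = kappa ** exponent               # rate stated in the abstract
unaccelerated = kappa                    # textbook gradient descent rate
print(accelerated, silver, unaccelerated)
```

The middle value lies strictly between the other two for any condition number greater than 1, matching the "intermediate" description in the abstract.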
1.Maximum Principle for Mean Field Type Control Problems with General Volatility Functions
Authors:Alain Bensoussan, Ziyu Huang, Sheung Chi Phillip Yam
Abstract: In this paper, we study the maximum principle of mean field type control problems when the volatility function depends on the state and its measure and also the control, by using our recently developed method. Our method is to embed the mean field type control problem into a Hilbert space to bypass the evolution in the Wasserstein space. We here give a necessary condition and a sufficient condition for these control problems in Hilbert spaces, and we also derive a system of forward-backward stochastic differential equations.
2.Nonlinear network identifiability: The static case
Authors:Renato Vizuete, Julien M. Hendrickx
Abstract: We analyze the problem of network identifiability with nonlinear functions associated with the edges. We consider a static model for the output of each node and by assuming a perfect identification of the function associated with the measurement of a node, we provide conditions for the identifiability of the edges in a specific class of functions. First, we analyze the identifiability conditions in the class of all nonlinear functions and show that even for a path graph, it is necessary to measure all the nodes except the source. Then, we consider analytic functions satisfying $f(0)=0$ and we provide conditions for the identifiability of paths and trees. Finally, by restricting the problem to a smaller class of functions where none of the functions is linear, we derive conditions for the identifiability of directed acyclic graphs. Some examples are presented to illustrate the results.
3.Barzilai-Borwein Descent Methods for Multiobjective Optimization Problems with Variable Trade-off Metrics
Authors:Jian Chen, Liping Tang, Xinmin Yang
Abstract: The imbalances and conditioning of the objective functions influence the performance of first-order methods for multiobjective optimization problems (MOPs). The latter is related to the metric selected in the direction-finding subproblems. Unlike single-objective optimization problems, capturing the curvature of all objective functions with a single Hessian matrix is impossible. On the other hand, second-order methods for MOPs use different metrics for objectives in direction-finding subproblems, leading to a high per-iteration cost. To balance per-iteration cost and better curvature exploration, we propose a Barzilai-Borwein descent method with variable metrics (BBDMO\_VM). In the direction-finding subproblems, we employ a variable metric to explore the curvature of all objectives. Subsequently, Barzilai-Borwein's method relative to the variable metric is applied to tune objectives, which mitigates the effect of imbalances. We investigate the convergence behaviour of the BBDMO\_VM, confirming fast linear convergence for well-conditioned problems relative to the variable metric. In particular, we establish linear convergence for problems that involve some linear objectives. These convergence results emphasize the importance of metric selection, motivating us to approximate the trade-off of Hessian matrices to better capture the geometry of the problem. Comparative numerical results confirm the efficiency of the proposed method, even when applied to large-scale and ill-conditioned problems.
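For readers unfamiliar with the Barzilai-Borwein rule that the abstract above builds on, here is a sketch of the classical single-objective version only. It is not BBDMO\_VM: there is no multiobjective direction-finding subproblem and no variable metric. The quadratic test function, the starting point, and the initial step are hypothetical choices.

```python
def grad(x):
    # Gradient of the hypothetical quadratic f(x) = 0.5 * (x[0]**2 + 100 * x[1]**2),
    # whose unique minimizer is the origin (condition number 100).
    return [x[0], 100.0 * x[1]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bb_gradient_descent(x0, first_step=0.005, iterations=30):
    """Gradient descent with the BB1 step alpha = (s.s) / (s.y)."""
    x_prev = list(x0)
    g_prev = grad(x_prev)
    # One plain gradient step is needed before the BB formula can be applied.
    x = [xi - first_step * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(iterations):
        g = grad(x)
        s = [a - b for a, b in zip(x, x_prev)]   # change in iterate
        y = [a - b for a, b in zip(g, g_prev)]   # change in gradient
        sy = dot(s, y)
        if sy <= 1e-20:  # iterates have stopped moving; no curvature information left
            break
        alpha = dot(s, s) / sy
        x_prev, g_prev = x, g
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


solution = bb_gradient_descent([1.0, 1.0])
print(solution)
```

The step length is computed from differences of successive iterates and gradients, so it adapts to curvature without forming a Hessian; the resulting sequence of objective values is typically non-monotone.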
4.On the Intelligent Proportional Controller Applied to Linear Systems
Authors:Mohamed Camil Belhadjoudja, Mohamed Maghenem, Emmanuel Witrant
Abstract: We analyze in this paper the effect of the well-known intelligent proportional controller on the stability of linear control systems. Inspired by the literature on neutral time delay systems and advanced type systems, we derive sufficient conditions on the order of the control system under which the controller fails to achieve exponential stability. Furthermore, we obtain conditions, relating the system's and the control parameters, such that the closed-loop system is either unstable or not exponentially stable. After that, we provide cases where the intelligent proportional controller achieves exponential stability. The obtained results are illustrated via numerical simulations, and on an experimental benchmark that consists of an electronic throttle valve.
5.Dynamical convergence analysis for nonconvex linearized proximal ADMM algorithms
Authors:Jiahong Guo, Xiao Wang, Xiantao Xiao
Abstract: The convergence analysis of optimization algorithms using continuous-time dynamical systems has received much attention in recent years. In this paper, we investigate applications of these systems to analyze the convergence of linearized proximal ADMM algorithms for nonconvex composite optimization, whose objective function is the sum of a continuously differentiable function and a composition of a possibly nonconvex function with a linear operator. We first derive a first-order differential inclusion for the linearized proximal ADMM algorithm, LP-ADMM. Both the global convergence and the convergence rates of the generated trajectory are established using the Kurdyka-Łojasiewicz (KL) property. Then, a stochastic variant, LP-SADMM, is investigated for finite-sum nonconvex composite problems. Under mild conditions, we obtain the stochastic differential equation corresponding to LP-SADMM, and demonstrate the almost sure global convergence of the generated trajectory by leveraging the KL property. Based on the almost sure convergence of the trajectory, we construct a stochastic process that converges almost surely to an approximate critical point of the objective function, and derive the expected convergence rates associated with this stochastic process. Moreover, we propose an accelerated LP-SADMM that incorporates Nesterov's acceleration technique. The continuous-time dynamical system of this algorithm is modeled as a second-order stochastic differential equation. Within the context of the KL property, we explore the related almost sure convergence and expected convergence rates.
6.Absorbing Markov Decision Processes
Authors:François Dufour, Tomás Prieto-Rumeau
Abstract: In this paper, we study discrete-time absorbing Markov Decision Processes (MDP) with measurable state space and Borel action space with a given initial distribution. For such models, solutions to the characteristic equation that are not occupation measures may exist. Several necessary and sufficient conditions are provided to guarantee that any solution to the characteristic equation is an occupation measure. Under the so-called continuity-compactness conditions, it is shown that the set of occupation measures is compact in the weak-strong topology if and only if the model is uniformly absorbing. Finally, it is shown that the occupation measures are characterized by the characteristic equation and an additional condition. Several examples are provided to illustrate our results.
7.Complexity analysis of regularization methods for implicitly constrained least squares
Authors:Akwum Onwunta, Clément W. Royer
Abstract: Optimization problems constrained by partial differential equations (PDEs) naturally arise in scientific computing, as those constraints often model physical systems or the simulation thereof. In an implicitly constrained approach, the constraints are incorporated into the objective through a reduced formulation. To this end, a numerical procedure is typically applied to solve the constraint system, and efficient numerical routines with quantifiable cost have long been developed. Meanwhile, the field of complexity in optimization, that estimates the cost of an optimization algorithm, has received significant attention in the literature, with most of the focus being on unconstrained or explicitly constrained problems. In this paper, we analyze an algorithmic framework based on quadratic regularization for implicitly constrained nonlinear least squares. By leveraging adjoint formulations, we can quantify the worst-case cost of our method to reach an approximate stationary point of the optimization problem. Our definition of such points exploits the least-squares structure of the objective, leading to an efficient implementation. Numerical experiments conducted on PDE-constrained optimization problems demonstrate the efficiency of the proposed framework.
8.Optimal adaptive control with separable drift uncertainty
Authors:Samuel N. Cohen, Christoph Knochenhauer, Alexander Merkel
Abstract: We consider a problem of stochastic optimal control with separable drift uncertainty in strong formulation on a finite horizon. The drift coefficient of the state $Y^{u}$ is multiplicatively influenced by an unknown random variable $\lambda$, while admissible controls $u$ are required to be adapted to the observation filtration. Choosing a control actively influences the state and information acquisition simultaneously and comes with a learning effect. The problem, initially non-Markovian, is embedded into a higher-dimensional Markovian, full information control problem with control-dependent filtration and noise. To that problem, we apply the stochastic Perron method to characterize the value function as the unique viscosity solution to the HJB equation, explicitly construct $\varepsilon$-optimal controls and show that the values of strong and weak formulations agree. Numerical illustrations show a significant difference between the adaptive control and the certainty equivalence control.
1.Relating Electric Vehicle Charging to Speed Scaling with Job-Specific Speed Limits
Authors:Leoni Winschermann, Marco E. T. Gerards, Antonios Antoniadis, Gerwin Hoogsteen, Johann Hurink
Abstract: Due to the ongoing electrification of transport in combination with limited power grid capacities, efficient ways to schedule electric vehicles (EVs) are needed for intraday operation of, for example, large parking lots. Common approaches like model predictive control repeatedly solve a corresponding offline problem. In this work, we present and analyze the Flow-based Offline Charging Scheduler (FOCS), an offline algorithm to derive an optimal EV charging schedule for a fleet of EVs that minimizes an increasing, convex and differentiable function of the corresponding aggregated power profile. To this end, we relate EV charging to mathematical speed scaling models with job-specific speed limits. We prove our algorithm to be optimal. Furthermore, we derive necessary and sufficient conditions for any EV charging profile to be optimal.
2.Inexact Decentralized Dual Gradient Tracking for Constraint-Coupled Optimization
Authors:Jingwang Li, Housheng Su
Abstract: We propose an inexact decentralized dual gradient tracking method (iDDGT) for distributed optimization problems with a globally coupled equality constraint. Unlike existing algorithms that rely on either the exact dual gradient or an inexact one obtained through single-step gradient descent, iDDGT introduces a new approach: utilizing an inexact dual gradient with controllable levels of inexactness. Numerical experiments demonstrate that iDDGT achieves significantly higher computational efficiency compared to state-of-the-art methods. Furthermore, it is proved that iDDGT can achieve linear convergence over directed graphs without imposing any conditions on the constraint matrix. This expands its applicability beyond existing algorithms that require the constraint matrix to have full row rank and undirected graphs for achieving linear convergence.
3.Stochastic Bridges over Ensemble of Linear Systems
Authors:Daniel Owusu Adu, Yongxin Chen
Abstract: We consider particles that are conditioned to initial and final states. The trajectory of these particles is uniquely shaped by the intricate interplay of internal and external sources of randomness. The internal randomness is aptly modelled through a parameter varying over a deterministic set, thereby giving rise to an ensemble of systems. Concurrently, the external randomness is introduced through the inclusion of white noise. Within this context, our primary objective is to effectively generate the stochastic bridge through the optimization of a random differential equation. As a deviation from the literature, we show that the optimal control mechanism, pivotal in the generation of the bridge, does not conform to the typical Markov strategy. Instead, it adopts a non-Markovian strategy, which can be more precisely classified as a stochastic feedforward control input. This unexpected divergence from the established strategies underscores the complex interrelationships present in the dynamics of the system under consideration.
4.Symmetric Stair Preconditioning of Linear Systems for Parallel Trajectory Optimization
Authors:Xueyi Bu, Brian Plancher
Abstract: There has been a growing interest in parallel strategies for solving trajectory optimization problems. One key step in many algorithmic approaches to trajectory optimization is the solution of moderately-large and sparse linear systems. Iterative methods are particularly well-suited for parallel solves of such systems. However, fast and stable convergence of iterative methods is reliant on the application of a high-quality preconditioner that reduces the spread and increases the clustering of the eigenvalues of the target matrix. To improve the performance of these approaches, we present a new parallel-friendly symmetric stair preconditioner. We prove that our preconditioner has advantageous theoretical properties when used in conjunction with iterative methods for trajectory optimization, such as a more clustered eigenvalue spectrum. Numerical experiments with typical trajectory optimization problems reveal that as compared to the best alternative parallel preconditioner from the literature, our symmetric stair preconditioner provides up to a 34% reduction in condition number and up to a 25% reduction in the number of resulting linear system solver iterations.
1.Optimization Method Based On Optimal Control
Authors:Yeming Xu, Ziyuan Guo, Hongxia Wang, Huanshui Zhang
Abstract: In this paper, we focus on a method based on optimal control to address the optimization problem. The objective is to find the optimal solution that minimizes the objective function. We transform the optimization problem into optimal control by designing an appropriate cost function. Using Pontryagin's Maximum Principle and the associated forward-backward difference equations (FBDEs), we derive the iterative update gain for the optimization. The steady system state can be considered as the solution to the optimization problem. Finally, we discuss the compelling characteristics of our method and further demonstrate its high precision, low oscillation, and applicability for finding different local minima of non-convex functions through several simulation examples.
2.Simba: A Scalable Bilevel Preconditioned Gradient Method for Fast Evasion of Flat Areas and Saddle Points
Authors:Nick Tsipinakis, Panos Parpas
Abstract: The convergence behaviour of first-order methods can be severely slowed down when applied to high-dimensional non-convex functions due to the presence of saddle points. If, additionally, the saddles are surrounded by large plateaus, it is highly likely that the first-order methods will converge to sub-optimal solutions. In machine learning applications, sub-optimal solutions mean poor generalization performance. They are also related to the issue of hyper-parameter tuning, since, in the pursuit of solutions that yield lower errors, a tremendous amount of time is required for selecting the hyper-parameters appropriately. A natural way to tackle the limitations of first-order methods is to employ the Hessian information. However, methods that incorporate the Hessian do not scale or, if they do, they are very slow for modern applications. Here, we propose Simba, a scalable preconditioned gradient method, to address the main limitations of the first-order methods. The method is very simple to implement. It maintains a single precondition matrix that is constructed as the outer product of the moving average of the gradients. To significantly reduce the computational cost of forming and inverting the preconditioner, we draw links with the multilevel optimization methods. These links enable us to construct preconditioners in a randomized manner. Our numerical experiments verify the scalability of Simba as well as its efficacy near saddles and flat areas. Further, we demonstrate that Simba offers a satisfactory generalization performance on standard benchmark residual networks. We also analyze Simba and show its linear convergence rate for strongly convex functions.
3.Computing Wasserstein Barycenter via operator splitting: the method of averaged marginals
Authors:D. Mimouni (IFPEN), P. Malisani (IFPEN), J. Zhu (IFPEN), W. de Oliveira (CMA)
Abstract: The Wasserstein barycenter (WB) is an important tool for summarizing sets of probabilities. It finds applications in applied probability, clustering, image processing, etc. When the probability supports are finite and fixed, the problem of computing a WB is formulated as a linear optimization problem whose dimensions generally exceed standard solvers' capabilities. For this reason, the WB problem is often replaced with a simpler nonlinear optimization model constructed via an entropic regularization function so that specialized algorithms can be employed to compute an approximate WB efficiently. Contrary to such a widespread inexact scheme, we propose an exact approach based on the Douglas-Rachford splitting method applied directly to the WB linear optimization problem for applications requiring accurate WB. Our algorithm, which has the interesting interpretation of being built upon averaging marginals, performs a series of simple (and exact) projections that can be parallelized and even randomized, making it suitable for large-scale datasets. As a result, our method achieves good performance in terms of speed while still attaining accuracy. Furthermore, the same algorithm can be applied to compute generalized barycenters of sets of measures with different total masses by allowing for mass creation and destruction upon setting an additional parameter. Our contribution to the field lies in the development of an exact and efficient algorithm for computing barycenters, enabling its wider use in practical applications. The approach's mathematical properties are examined, and the method is benchmarked against the state-of-the-art methods on several data sets from the literature.
4.Dynamic Pricing in an Energy Community Providing Capacity Limitation Services
Authors:Bennevis Crowley, Jalal Kazempour, Lesia Mitridati
Abstract: This paper proposes a mathematical framework for dynamic pricing in an energy community to enable the provision of capacity limitation services to the distribution grid. In this framework, the energy community complies with a time-variant limit on its maximum power import from the distribution grid in exchange for grid tariff discounts. A bi-level optimization model is developed to implicitly coordinate the energy usage of prosumers within the community. In the upper-level problem, the community manager minimizes the total operational cost of the community based on reduced grid tariffs and power capacity limits by setting time-variant and prosumer-specific prices. In the lower-level problem, each prosumer subsequently adjusts their energy usage over a day to minimize their individual operational cost. This framework allows the community manager to maintain central economic market properties such as budget balance and individual rationality for prosumers. We show how the community benefits can be allocated to prosumers either in an equal or a proportional manner. The proposed model is eventually reformulated into a mixed integer second-order cone program and thereafter applied to a distribution grid case study.
5.Convergence analysis of the semismooth Newton method for sparse control problems governed by semilinear elliptic equations
Authors:Eduardo Casas, Mariano Mateos
Abstract: We show that a second order sufficient condition for local optimality, along with a strict complementarity condition, is enough to get the super-linear convergence of the semismooth Newton method for an optimal control problem governed by a semilinear elliptic equation. The objective functional may include a sparsity promoting term and we allow for box control constraints. We also obtain quadratic convergence under quite natural assumptions on the data of the control problem.
6.Turnpike and dissipativity in generalized discrete-time stochastic linear-quadratic optimal control
Authors:Jonas Schießl, Ruchuan Ou, Timm Faulwasser, Michael Heinrich Baumann, Lars Grüne
Abstract: We investigate different turnpike phenomena of generalized discrete-time stochastic linear-quadratic optimal control problems. Our analysis is based on a novel strict dissipativity notion for such problems, in which a stationary stochastic process replaces the optimal steady state of the deterministic setting. We show that from this time-varying dissipativity notion, we can conclude turnpike behaviors concerning different objects like distributions, moments, or sample paths of the stochastic system and that the distributions of the stationary pair can be characterized by a stationary optimization problem. The analytical findings are illustrated by numerical simulations.
7.Algorithms for DC Programming via Polyhedral Approximations of Convex Functions
Authors:Fahaar Mansoor Pirani, Firdevs Ulus
Abstract: There is an existing exact algorithm that solves DC programming problems if one component of the DC function is polyhedral convex (Loehne, Wagner, 2017). Motivated by this, first, we consider two cutting-plane algorithms for generating an $\epsilon$-polyhedral underestimator of a convex function $g$. The algorithms start with a polyhedral underestimator of $g$ and the epigraph of the current underestimator is intersected with either a single halfspace (Algorithm 1) or with possibly multiple halfspaces (Algorithm 2) in each iteration to obtain a better approximation. We prove the correctness and finiteness of both algorithms, establish the convergence rate of Algorithm 1, and show that after obtaining an $\epsilon$-polyhedral underestimator of the first component of a DC function, the algorithm from (Loehne, Wagner, 2017) can be applied to compute an $\epsilon$-solution of the DC programming problem without further computational effort. We then propose an algorithm (Algorithm 3) for solving DC programming problems by iteratively generating a (not necessarily $\epsilon$-) polyhedral underestimator of $g$. We prove that Algorithm 3 stops after finitely many iterations and returns an $\epsilon$-solution to the DC programming problem. Moreover, the sequence $\{x_k\}_{k\geq 0}$ output by Algorithm 3 converges to a global minimizer of the DC problem when $\epsilon$ is set to zero. Computational results based on some test instances from the literature are provided.
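The single-halfspace idea described in the abstract above can be sketched in one dimension, where each halfspace is a tangent line lying below the convex function. This is a simplified illustration under several assumptions, not the paper's Algorithm 1: the function is differentiable and univariate, the gap is only measured on a finite grid, `eps` must be strictly positive, and the name `build_underestimator` is hypothetical.

```python
def build_underestimator(g, dg, lo, hi, eps, grid_size=201):
    """Add tangent cuts to a convex g until the gap on a grid is at most eps.

    Each cut is stored as (slope, intercept); the underestimator is the
    pointwise maximum of all cuts. Assumes eps > 0.
    """
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    cuts = [(dg(lo), g(lo) - dg(lo) * lo)]  # initial underestimator: tangent at lo
    while True:
        def lower(x):
            return max(a * x + b for a, b in cuts)

        # Grid point where the current underestimator is worst.
        gap, worst = max((g(x) - lower(x), x) for x in grid)
        if gap <= eps:
            return cuts
        # One new halfspace per iteration: the tangent at the worst point.
        cuts.append((dg(worst), g(worst) - dg(worst) * worst))


# Hypothetical example: g(x) = x^2 on [-1, 1] with tolerance 0.01.
cuts = build_underestimator(lambda x: x * x, lambda x: 2 * x, -1.0, 1.0, 0.01)
print(len(cuts))
```

Because a tangent of a convex function never exceeds it, the maximum of the cuts is always a valid underestimator, and adding a cut at a grid point closes the gap there, so the loop ends after at most `grid_size` iterations.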
8.Energy-optimal Timetable Design for Sustainable Metro Railway Networks
Authors:Shuvomoy Das Gupta, Bart P. G. Van Parys, J. Kevin Tobin
Abstract: We present our collaboration with Thales Canada Inc, the largest provider of communication-based train control (CBTC) systems worldwide. We study the problem of designing energy-optimal timetables in metro railway networks to minimize the effective energy consumption of the network, which corresponds to simultaneously minimizing total energy consumed by all the trains and maximizing the transfer of regenerative braking energy from suitable braking trains to accelerating trains. We propose a novel data-driven linear programming model that minimizes the total effective energy consumption in a metro railway network, capable of computing the optimal timetable in real-time, even for some of the largest CBTC systems in the world. In contrast with existing works, which are either NP-hard or involve multiple stages requiring extensive simulation, our model is a single linear programming model capable of computing the energy-optimal timetable subject to the constraints present in the railway network. Furthermore, our model can predict the total energy consumption of the network without requiring time-consuming simulations, making it suitable for widespread use in managerial settings. We apply our model to Shanghai Railway Network's Metro Line 8 -- one of the largest and busiest railway services in the world -- and empirically demonstrate that our model computes energy-optimal timetables for thousands of active trains spanning an entire service period of one day in real-time (solution time less than one second on a standard desktop), achieving energy savings between approximately 20.93% and 28.68%. Given the compelling advantages, our model is in the process of being integrated into Thales Canada Inc's industrial timetable compiler.
9.Safe Adaptive Control of Hyperbolic PDE-ODE Cascades
Authors:Ji Wang, Miroslav Krstic
Abstract: Adaptive safe control employing conventional continuous infinite-time adaptation requires that the initial conditions be restricted to a subset of the safe set due to parametric uncertainty, where the safe set is shrunk in inverse proportion to the adaptation gain. The recent regulation-triggered adaptive control approach with batch least-squares identification (BaLSI, pronounced "ballsy") completes perfect parameter identification in finite time and offers a previously unforeseen advantage in adaptive safe control, which we elucidate in this paper. Since the true challenge of safe control is exhibited for CBF of a high relative degree, we undertake a safe BaLSI design in this paper for a class of systems that possess a particularly extreme relative degree: ODE-PDE-ODE sandwich systems. Such sandwich systems arise in various applications, including delivery UAV with a cable-suspended load. Collision avoidance of the payload with the surrounding environment is required. The considered class of plants is $2\times2$ hyperbolic PDEs sandwiched by a strict-feedback nonlinear ODE and a linear ODE, where the unknown coefficients, whose bounds are known and arbitrary, are associated with the PDE in-domain coupling terms that can cause instability and with the input signal of the distal ODE. This is the first safe adaptive control design for PDEs, where we introduce the concept of PDE CBF whose non-negativity as well as the ODE CBF's non-negativity are ensured with a backstepping-based safety filter. Our safe adaptive controller is explicit and operates in the entire original safe set.
10.A distributionally robust index tracking model with the CVaR penalty: tractable reformulation
Authors:Ruyu Wang, Yaozhong Hu, Chao Zhang
Abstract: We propose a distributionally robust index tracking model with the conditional value-at-risk (CVaR) penalty. The model combines the idea of distributionally robust optimization for data uncertainty and the CVaR penalty to avoid large tracking errors. The probability ambiguity is described through a confidence region based on the first-order and second-order moments of the random vector involved. We reformulate the model in the form of a min-max-min optimization into an equivalent nonsmooth minimization problem. We further give an approximate discretization scheme of the possible continuous random vector of the nonsmooth minimization problem, whose objective function involves the maximum of numerous but finite nonsmooth functions. The convergence of the discretization scheme to the equivalent nonsmooth reformulation is shown under mild conditions. A smoothing projected gradient (SPG) method is employed to solve the discretization scheme. Any accumulation point is shown to be a global minimizer of the discretization scheme. Numerical results on the NASDAQ index dataset from January 2008 to July 2023 demonstrate the effectiveness of our proposed model and the efficiency of the SPG method, compared with several state-of-the-art models and corresponding methods for solving them.
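For readers unfamiliar with CVaR, the sketch below computes the empirical CVaR of a sample of losses using the standard Rockafellar-Uryasev representation, $\mathrm{CVaR}_\alpha(L) = \min_t \{ t + \mathbb{E}[(L - t)^+] / (1 - \alpha) \}$. The abstract does not state which representation the paper uses, so this formula is an assumption made for illustration. The function name `empirical_cvar` is hypothetical, and the level `alpha` is assumed to satisfy 0 <= alpha < 1.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha via the Rockafellar-Uryasev formula
    min over t of  t + mean(max(L - t, 0)) / (1 - alpha)."""
    n = len(losses)

    def objective(t):
        return t + sum(max(loss - t, 0.0) for loss in losses) / (n * (1.0 - alpha))

    # The objective is convex and piecewise linear in t with kinks at the
    # sample values, so a minimizer can be found among the samples.
    return min(objective(t) for t in losses)


# Illustrative tracking errors (made-up numbers, not data from the paper).
tracking_errors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
cvar_80 = empirical_cvar(tracking_errors, 0.8)
```

With ten equally weighted samples and alpha = 0.8, the result equals the average of the two largest losses, which is the sense in which a CVaR penalty targets large tracking errors.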
11.An exact algorithm for linear optimization problem subject to max-product fuzzy relational inequalities with fuzzy constraints
Authors:Amin Ghodousian, Romina Omidi
Abstract: Fuzzy relational inequalities with fuzzy constraints (FRI-FC) are the generalized form of fuzzy relational inequalities (FRI) in which fuzzy inequality replaces ordinary inequality in the constraints. Fuzzy constraints enable us to attain optimal points (called super-optima) that are better solutions than those resulting from solving similar problems with ordinary inequality constraints. This paper considers the optimization of a linear objective function subject to max-product FRI-FC constraints. It is proved that there is a set of optimization problems equivalent to the primal problem. Based on the algebraic structure of the primal problem and its equivalent forms, some simplification operations are presented to convert the main problem into a simpler one. Finally, by some appropriate mathematical manipulations, the main problem is transformed into an optimization model whose constraints are linear. The proposed linearization method not only provides a super-optimum (that is, a better solution than ordinary feasible optimal solutions) but also finds the best super-optimum for the main problem. The current approach is compared with our previous work and some well-known heuristic algorithms by applying them to random test problems of different sizes.
1.Optimal strategies for mosquitoes replacement techniques: influence of the carrying capacity on spatial releases
Authors:Luis Almeida, Jesús Bellver Arnau, Gwenaël Peltier, Nicolas Vauchelet
Abstract: This work is devoted to the mathematical study of an optimization problem regarding control strategies of mosquito population in a heterogeneous environment. Mosquitoes are well known to be vectors of diseases, but, in some cases, they have a reduced vector capacity when carrying the endosymbiotic bacterium Wolbachia. We consider a mathematical model of a replacement strategy, consisting in rearing and releasing Wolbachia-infected mosquitoes to replace the wild population. We investigate the question of optimizing the release protocol to have the most effective replacement when the environment is heterogeneous. In other words we focus on the question: where to release, given an inhomogeneous environment, in order to maximize the replacement across the domain. To do so, we consider a simple scalar model in which we assume that the carrying capacity is space dependent. Then, we investigate the existence of an optimal release profile and prove some interesting properties. In particular, neglecting the mobility of mosquitoes and under some assumptions on the biological parameters, we characterize the optimal releasing strategy for a short time horizon, and provide a way to reduce to a one-dimensional optimization problem the case of a long time horizon. Our theoretical results are illustrated with several numerical simulations.
2.A hybrid physics-informed neural network based multiscale solver as a partial differential equation constrained optimization problem
Authors:Michael Hintermüller, Denis Korolev
Abstract: In this work, we study physics-informed neural networks (PINNs) constrained by partial differential equations (PDEs) and their application in approximating multiscale PDEs. From a continuous perspective, our formulation corresponds to a non-standard PDE-constrained optimization problem with a PINN-type objective. From a discrete standpoint, the formulation represents a hybrid numerical solver that utilizes both neural networks and finite elements. We propose a function space framework for the problem and develop an algorithm for its numerical solution, combining an adjoint-based technique from optimal control with automatic differentiation. The multiscale solver is applied to a heat transfer problem with oscillating coefficients, where the neural network approximates a fine-scale problem, and a coarse-scale problem constrains the learning process. We show that incorporating coarse-scale information into the neural network training process through our modelling framework acts as a preconditioner for the low-frequency component of the fine-scale PDE, resulting in improved convergence properties and accuracy of the PINN method. The relevance and potential applications of the hybrid solver to computational homogenization and material science are discussed.
1.Local properties and augmented Lagrangians in fully nonconvex composite optimization
Authors:Alberto De Marchi, Patrick Mehlitz
Abstract: A broad class of optimization problems can be cast in composite form, that is, considering the minimization of the composition of a lower semicontinuous function with a differentiable mapping. This paper discusses the versatile template of composite optimization without any convexity assumptions. First- and second-order optimality conditions are discussed, advancing the variational analysis of compositions. We highlight the difficulties that stem from the lack of convexity when dealing with necessary conditions in a Lagrangian framework and when considering error bounds. Building upon these characterizations, a local convergence analysis is delineated for a recently developed augmented Lagrangian method, deriving rates of convergence in the fully nonconvex setting.
2.An optimal control approach for the treatment of hepatitis C patients
Authors:Anh-Tuan Nguyen, Hien Tran
Abstract: In this article, we study the feasibility of using optimal control theory to develop control-theoretic methods for personalized treatment of HCV patients. The mathematical model for HCV progression includes compartments for healthy hepatocytes, infected hepatocytes, infectious virions and noninfectious virions. Methodologies from optimal control theory are used to design and synthesize an open-loop control-based treatment regimen for HCV dynamics.
3.PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates
Authors:Zachary Frangella, Pratik Rathore, Shipu Zhao, Madeleine Udell
Abstract: This paper introduces PROMISE ($\textbf{Pr}$econditioned Stochastic $\textbf{O}$ptimization $\textbf{M}$ethods by $\textbf{I}$ncorporating $\textbf{S}$calable Curvature $\textbf{E}$stimates), a suite of sketching-based preconditioned stochastic gradient algorithms for solving large-scale convex optimization problems arising in machine learning. PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha; each algorithm comes with a strong theoretical analysis and effective default hyperparameter values. In contrast, traditional stochastic gradient methods require careful hyperparameter tuning to succeed, and degrade in the presence of ill-conditioning, a ubiquitous phenomenon in machine learning. Empirically, we verify the superiority of the proposed algorithms by showing that, using default hyperparameter values, they outperform or match popular tuned stochastic gradient optimizers on a test bed of $51$ ridge and logistic regression problems assembled from benchmark machine learning repositories. On the theoretical side, this paper introduces the notion of quadratic regularity in order to establish linear convergence of all proposed methods even when the preconditioner is updated infrequently. The speed of linear convergence is determined by the quadratic regularity ratio, which often provides a tighter bound on the convergence rate compared to the condition number, both in theory and in practice, and explains the fast global linear convergence of the proposed methods.
4.A Prescriptive Trilevel Equilibrium Model for Optimal Emissions Pricing and Sustainable Energy Systems Development
Authors:Olli Herrala, Steven A. Gabriel, Fabricio Oliveira, Tommi Ekholm
Abstract: We explore the class of trilevel equilibrium problems with a focus on energy-environmental applications. In particular, we apply this trilevel framework to a power market model, exploring the possibilities of an international policymaker in reducing emissions of the system. We present two alternative solution methods for such problems and a comparison of the resulting model sizes. The first method is based on a reformulation of the bottom-level solution set, and the second one uses strong duality. The first approach results in optimality conditions that are both necessary and sufficient, while the second one results in a model with fewer constraints but only sufficient optimality conditions. Using the proposed methods, we are able to obtain globally optimal solutions for a realistic five-node case study representing the Nordic countries and assess the impact of a carbon tax on the electricity production portfolio.
5.Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent
Authors:Stefano Di Giovacchino, Desmond J. Higham, Konstantinos Zygalakis
Abstract: Stochastic optimization methods have been hugely successful in making large-scale optimization problems feasible when computing the full gradient is computationally prohibitive. Using the theory of modified equations for numerical integrators, we propose a class of stochastic differential equations that approximate the dynamics of general stochastic optimization methods more closely than the original gradient flow. Analyzing a modified stochastic differential equation can reveal qualitative insights about the associated optimization method. Here, we study mean-square stability of the modified equation in the case of stochastic coordinate descent.
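As a point of reference for the method being analyzed, the following is a minimal sketch of stochastic coordinate descent on a quadratic $f(x) = \tfrac12 x^\top A x$. It only illustrates the discrete iteration, not the modified stochastic differential equation from the paper. The matrix, step size, and the function name `stochastic_coordinate_descent` are assumptions chosen for illustration.

```python
import random


def stochastic_coordinate_descent(A, x0, step, n_iter, seed=0):
    """Minimize f(x) = 0.5 * x^T A x by updating one randomly chosen
    coordinate per iteration with a fixed step size."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(n_iter):
        i = rng.randrange(n)                            # pick a coordinate uniformly
        grad_i = sum(A[i][j] * x[j] for j in range(n))  # i-th partial derivative
        x[i] -= step * grad_i                           # update only that coordinate
    return x


# Symmetric positive definite example (illustrative values).
A = [[2.0, 1.0], [1.0, 3.0]]
x_final = stochastic_coordinate_descent(A, [1.0, -1.0], step=0.3, n_iter=2000)
```

For this toy quadratic, an update of coordinate i decreases f whenever 0 < step < 2 / A[i][i], so the chosen step of 0.3 drives the iterates to the minimizer at the origin. Larger steps can destabilize the iteration, which is the kind of qualitative behaviour a stability analysis aims to capture.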
6.Finite dimensional backstepping controller design
Authors:Varga Kalantarov, Türker Özsarı, Kemal Cem Yılmaz
Abstract: We introduce a finite dimensional version of backstepping controller design for stabilizing solutions of PDEs from boundary. Our controller uses only a finite number of Fourier modes of the state of solution, as opposed to the classical backstepping controller which uses all (infinitely many) modes. We apply our method to the reaction-diffusion equation, which serves only as a canonical example but the method is applicable also to other PDEs whose solutions can be decomposed into a slow finite-dimensional part and a fast tail, where the former dominates the evolution in large time. One of the main goals is to estimate the sufficient number of modes needed to stabilize the plant at a prescribed rate. In addition, we find the minimal number of modes that guarantee the stabilization at a certain (unprescribed) decay rate. Theoretical findings are supported with numerical solutions.
7.Lifting functionals defined on maps to measure-valued maps via optimal transport
Authors:Hugo Lavenant
Abstract: How can one lift a functional defined on maps from a space X to a space Y into a functional defined on maps from X into P(Y), the space of probability distributions over Y? Looking at measure-valued maps can be interpreted as knowing a classical map with uncertainty, and from an optimization point of view the main gain is the convexification of Y into P(Y). We will explain why trying to single out the largest convex lifting amounts to solving an optimal transport problem with an infinity of marginals, which can be interesting in itself. Moreover, we will show that, to recover previously proposed liftings for functionals depending on the Jacobian of the map, one needs to add a restriction of additivity to the lifted functional.
8.An Efficient Semi-Real-Time Algorithm for Path Planning in the Hamilton-Jacobi Formulation
Authors:Christian Parkinson, Kyle Polage
Abstract: We present a semi-real-time algorithm for minimal-time optimal path planning based on optimal control theory, dynamic programming, and Hamilton-Jacobi (HJ) equations. Partial differential equation (PDE) based optimal path planning methods are well-established in the literature, and provide an interpretable alternative to black-box machine learning algorithms. However, due to the computational burden of grid-based PDE solvers, many previous methods do not scale well to high dimensional problems and are not applicable in real-time scenarios even for low dimensional problems. We present a semi-real-time algorithm for optimal path planning in the HJ formulation, using grid-free numerical methods based on Hopf-Lax formulas. In doing so, we retain the interpretability of PDE based path planning, but because the numerical method is grid-free, it is efficient and does not suffer from the curse of dimensionality, and thus can be applied in semi-real-time and account for realistic concerns like obstacle discovery. This represents a significant step in averting the tradeoff between interpretability and efficiency. We present the algorithm with application to synthetic examples of isotropic motion planning in two dimensions, though with slight adjustments, it could be applied to many other problems.
9.First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians
Authors:Nikita Doikov, Geovani Nunes Grapiglia
Abstract: In this work, we develop first-order (Hessian-free) and zero-order (derivative-free) implementations of the Cubically regularized Newton method for solving general non-convex optimization problems. For that, we employ finite difference approximations of the derivatives. We use a special adaptive search procedure in our algorithms, which simultaneously fits both the regularization constant and the parameters of the finite difference approximations. It makes our schemes free from the need to know the actual Lipschitz constants. Additionally, we equip our algorithms with the lazy Hessian update that reuses a previously computed Hessian approximation matrix for several iterations. Specifically, we prove the global complexity bound of $\mathcal{O}( n^{1/2} \epsilon^{-3/2})$ function and gradient evaluations for our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2} \epsilon^{-3/2} )$ function evaluations for the derivative-free method, where $n$ is the dimension of the problem and $\epsilon$ is the desired accuracy for the gradient norm. These complexity bounds significantly improve the previously known ones in terms of the joint dependence on $n$ and $\epsilon$, for the first-order and zeroth-order non-convex optimization.
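The Hessian-free idea can be illustrated with a generic forward-difference approximation of the Hessian built from gradient calls only. This is a textbook scheme shown under simplifying assumptions (a fixed difference step `h`, no adaptive search), not the paper's procedure. The function name `fd_hessian` is hypothetical.

```python
def fd_hessian(grad, x, h=1e-5):
    """Approximate the Hessian at x from gradient evaluations only, using
    column j ~ (grad(x + h * e_j) - grad(x)) / h, then symmetrizing."""
    n = len(x)
    g0 = grad(x)
    cols = []
    for j in range(n):
        shifted = list(x)
        shifted[j] += h                      # perturb the j-th coordinate
        gj = grad(shifted)
        cols.append([(gj[i] - g0[i]) / h for i in range(n)])
    # cols[j][i] approximates d^2 f / (dx_i dx_j); average with the transpose.
    return [[0.5 * (cols[j][i] + cols[i][j]) for j in range(n)] for i in range(n)]


# Gradient of f(x) = x0^2 + x0*x1 + 1.5*x1^2, whose Hessian is [[2, 1], [1, 3]].
def example_grad(x):
    return [2.0 * x[0] + x[1], x[0] + 3.0 * x[1]]


H = fd_hessian(example_grad, [0.5, -1.0])
```

Each approximation costs n extra gradient evaluations, which is why reusing one approximation across several iterations, as in the lazy update described above, can reduce the total work.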
10.Crack propagation in anisotropic brittle materials: from a phase-field model to a shape optimization approach
Authors:Tim Suchan, Chaitanya Kandekar, Wolfgang E. Weber, Kathrin Welker
Abstract: The phase-field method is based on the energy minimization principle which is a geometric method for modeling diffusive cracks that are popularly implemented with irreversibility based on Griffith's criterion. This method requires a length-scale parameter that smooths the sharp discontinuity, which influences the diffuse band and results in mesh-sensitive fracture propagation results. Recently, a novel approach based on the optimization on Riemannian shape spaces has been proposed, where the crack path is realized by techniques from shape optimization. This approach requires the shape derivative, which is derived in a continuous sense and used for a gradient-based algorithm to minimize the energy of the system. Due to the continuous derivation of the shape derivative, this approach yields mesh-independent results. In this paper, the novel approach based on shape optimization is presented, followed by an assessment of the predicted crack path in anisotropic brittle material using numerical calculations from a phase-field model.
1.Model Predictive Control using MATLAB
Authors:Midhun T. Augustine
Abstract: This tutorial consists of a brief introduction to the modern control approach called model predictive control (MPC) and its numerical implementation using MATLAB. We discuss the basic concepts and numerical implementation of the two major classes of MPC: Linear MPC (LMPC) and Nonlinear MPC (NMPC). This includes the various aspects of MPC such as formulating the optimization problem, constraints handling, feasibility, stability, and optimality.
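The receding-horizon principle behind linear MPC can be sketched in a few lines. The example below is written in Python rather than MATLAB and assumes a scalar system $x^+ = a x + b u$ with quadratic stage cost $q x^2 + r u^2$ and no constraints, so the finite-horizon problem reduces to a backward Riccati recursion. The function names, parameter values, and the choice of terminal weight equal to `q` are all assumptions for illustration.

```python
def mpc_gain(a, b, q, r, horizon):
    """Feedback gain for the first step of the finite-horizon LQ problem,
    computed by a backward Riccati recursion (terminal weight assumed = q)."""
    p = q
    k = 0.0
    for _ in range(horizon):
        k = a * b * p / (r + b * b * p)           # minimizing input is u = -k * x
        p = q + r * k * k + p * (a - b * k) ** 2  # cost-to-go weight one step earlier
    return k


def simulate_mpc(a, b, q, r, horizon, x0, steps):
    """Receding-horizon loop: re-solve at every step, apply only the first input."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = -mpc_gain(a, b, q, r, horizon) * x
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


# Open-loop unstable plant (a > 1) regulated to the origin.
trajectory = simulate_mpc(a=1.2, b=1.0, q=1.0, r=1.0, horizon=10, x0=5.0, steps=30)
```

With input or state constraints the inner problem no longer has this closed form and is typically posed as a quadratic program solved at each step. The loop structure stays the same.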
2.Convergence Analysis of the Best Response Algorithm for Time-Varying Games
Authors:Zifan Wang, Yi Shen, Michael M. Zavlanos, Karl H. Johansson
Abstract: This paper studies a class of strongly monotone games involving non-cooperative agents that optimize their own time-varying cost functions. We assume that the agents can observe other agents' historical actions and choose actions that best respond to other agents' previous actions; we call this a best response scheme. We start by analyzing the convergence rate of this best response scheme for standard time-invariant games. Specifically, we provide a sufficient condition on the strong monotonicity parameter of the time-invariant games under which the proposed best response algorithm achieves exponential convergence to the static Nash equilibrium. We further illustrate that this best response algorithm may oscillate when the proposed sufficient condition fails to hold, which indicates that this condition is tight. Next, we analyze this best response algorithm for time-varying games where the cost functions of each agent change over time. Under similar conditions as for time-invariant games, we show that the proposed best response algorithm stays asymptotically close to the evolving equilibrium. We do so by analyzing both the equilibrium tracking error and the dynamic regret. Numerical experiments on economic market problems are presented to validate our analysis.
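The best response scheme can be illustrated on a toy two-player game. The sketch assumes that player i minimizes the quadratic cost $\tfrac12 a_i x_i^2 + b_i x_i x_j - c_i x_i$ over its own action, so its best response to the opponent's previous action is $(c_i - b_i x_j)/a_i$. The game, the parameter values, and the function name `best_response_dynamics` are assumptions for illustration and are not taken from the paper.

```python
def best_response_dynamics(a, b, c, x0, n_iter):
    """Simultaneous best-response iteration in a two-player quadratic game.
    Player i minimizes 0.5*a[i]*x_i**2 + b[i]*x_i*x_j - c[i]*x_i over x_i."""
    x1, x2 = x0
    history = [(x1, x2)]
    for _ in range(n_iter):
        # Each player best-responds to the opponent's previous action.
        x1, x2 = (c[0] - b[0] * x2) / a[0], (c[1] - b[1] * x1) / a[1]
        history.append((x1, x2))
    return history


# Weak coupling relative to curvature: iterates approach the equilibrium (1, 1).
converging = best_response_dynamics((2.0, 2.0), (1.0, 1.0), (3.0, 3.0), (0.0, 0.0), 60)

# Strong coupling: iterates oscillate around the equilibrium (0.6, 0.6) and drift away.
oscillating = best_response_dynamics((2.0, 2.0), (3.0, 3.0), (3.0, 3.0), (0.0, 0.0), 20)
```

In this toy game the iteration is a linear map that contracts when the coupling is weaker than the curvature of each player's own cost. This mirrors, in a very simplified form, the role a condition on the strong monotonicity parameter plays in the abstract.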
3.Urban Logistics in Amsterdam: A Modal Shift from Roadways to Waterway
Authors:Nadia Pourmohammad-Zia, Mark van Koningsveld
Abstract: The efficiency of urban logistics is vital for economic prosperity and quality of life in cities. However, rapid urbanization poses significant challenges, such as congestion, emissions, and strained infrastructure. This paper addresses these challenges by proposing an optimal urban logistic network that integrates urban waterways and last-mile delivery in Amsterdam. The study highlights the untapped potential of inland waterways in addressing logistical challenges in the city center. The problem is formulated as a two-echelon location routing problem with time windows, and a hybrid solution approach is developed to solve it effectively. The proposed algorithm consistently outperforms existing approaches, demonstrating its effectiveness in solving existing benchmarks and newly developed instances. Through a comprehensive case study, the advantages of implementing a waterway-based distribution chain are assessed, revealing substantial cost savings (approximately 28%) and reductions in vehicle weight (about 43%) and travel distances (roughly 80%) within the city center. The incorporation of electric vehicles further contributes to environmental sustainability. Sensitivity analysis underscores the importance of managing transshipment location establishment costs as a key strategy for cost efficiencies and reducing reliance on delivery vehicles and road traffic congestion. This study provides valuable insights and practical guidance for managers seeking to enhance operational efficiency, reduce costs, and promote sustainable transportation practices. Further analysis is warranted to fully evaluate the feasibility and potential benefits, considering infrastructural limitations and canal characteristics.
4.Enhancing PGA Tour Performance: Leveraging Shotlink™ Data for Optimization and Prediction
Authors:Matthieu Guillot, Gautier Stauffer
Abstract: In this study, we demonstrate how data from the PGA Tour, combined with stochastic shortest path models (MDPs), can be employed to refine the strategies of professional golfers and predict future performances. We present a comprehensive methodology for this objective, proving its computational feasibility. This sets the stage for more in-depth exploration into leveraging data available to professionals and amateurs for strategic optimization and forecasting performance in golf. For the replicability of our results, and to adapt and extend the methodology and prototype solution, we provide access to all our codes and analyses (R and C++).
5.Directional Tykhonov well-posedness for optimization problems and variational inequalities
Authors:Vo Ke Hoang, Vo Si Trong Long
Abstract: By using the so-called minimal time function, we propose and study a novel notion of directional Tykhonov well-posedness for optimization problems, which is an extension of the widely acknowledged notion of Tykhonov. In this way, we first provide some characterizations of this notion in terms of the diameter of level sets and admissible functions. Then, we investigate relationships between the level sets and admissible functions mentioned above. Finally, we apply the technology developed before to study directional Tykhonov well-posedness for variational inequalities. Several examples are presented as well to illustrate the applicability of our results.
6.Integral Quadratic Constraints with Infinite-Dimensional Channels
Authors:Aleksandr Talitckii, Peter Seiler, Matthew M. Peet
Abstract: Modern control theory provides us with a spectrum of methods for studying the interconnection of dynamic systems using input-output properties of the interconnected subsystems. Perhaps the most advanced framework for such input-output analysis is the use of Integral Quadratic Constraints (IQCs), which considers the interconnection of a nominal linear system with an unmodelled nonlinear or uncertain subsystem with known input-output properties. Although these methods are widely used for Ordinary Differential Equations (ODEs), there have been fewer attempts to extend IQCs to infinite-dimensional systems. In this paper, we present an IQC-based framework for Partial Differential Equations (PDEs) and Delay Differential Equations (DDEs). First, we introduce infinite-dimensional signal spaces, operators, and feedback interconnections. Next, in the main result, we propose a formulation of hard IQC-based input-output stability conditions, allowing for infinite-dimensional multipliers. We then show how to test hard IQC conditions with infinite-dimensional multipliers on a nominal linear PDE or DDE system via the Partial Integral Equation (PIE) state-space representation using a sufficient version of the Kalman-Yakubovich-Popov lemma (KYP). The results are then illustrated using four example problems with uncertainty and nonlinearity.
7.Online Distributed Learning over Random Networks
Authors:Nicola Bastianello, Diego Deplano, Mauro Franceschelli, Karl H. Johansson
Abstract: The recent deployment of multi-agent systems in a wide range of scenarios has enabled the solution of learning problems in a distributed fashion. In this context, agents are tasked with collecting local data and then cooperatively training a model, without directly sharing the data. While distributed learning offers the advantage of preserving agents' privacy, it also poses several challenges in terms of designing and analyzing suitable algorithms. This work focuses specifically on the following challenges motivated by practical implementation: (i) online learning, where the local data change over time; (ii) asynchronous agent computations; (iii) unreliable and limited communications; and (iv) inexact local computations. To tackle these challenges, we introduce the Distributed Operator Theoretical (DOT) version of the Alternating Direction Method of Multipliers (ADMM), which we call the DOT-ADMM Algorithm. We prove that it converges with a linear rate for a large class of convex learning problems (e.g., linear and logistic regression problems) toward a bounded neighborhood of the optimal time-varying solution, and characterize how the neighborhood depends on (i)-(iv). We corroborate the theoretical analysis with numerical simulations comparing the DOT-ADMM Algorithm with other state-of-the-art algorithms, showing that only the proposed algorithm exhibits robustness to (i)-(iv).
1.Optimal Stopping of BSDEs with Constrained Jumps and Related Zero-Sum Games
Authors:Magnus Perninge
Abstract: In this paper, we introduce a non-linear Snell envelope which at each time represents the maximal value that can be achieved by stopping a BSDE with constrained jumps. We establish the existence of the Snell envelope by employing a penalization technique and the primary challenge we encounter is demonstrating the regularity of the limit for the scheme. Additionally, we relate the Snell envelope to a finite horizon, zero-sum stochastic differential game, where one player controls a path-dependent stochastic system by invoking impulses, while the opponent is given the opportunity to stop the game prematurely. Importantly, by developing new techniques within the realm of control randomization, we demonstrate that the value of the game exists and is precisely characterized by our non-linear Snell envelope.
2.Interior point methods in optimal control problems of affine systems: Convergence results and solving algorithms
Authors:Paul Malisani (IFPEN)
Abstract: This paper presents an interior point method for pure-state and mixed-constrained optimal control problems in which the dynamics, mixed constraints, and cost function are all affine in the control variable. This method relies on solving a sequence of two-point boundary value problems of differential and algebraic equations. This paper establishes a convergence result for primal and dual variables of the optimal control problem. A primal and a primal-dual solving algorithm are presented, and a challenging numerical example is treated for illustration. Accepted for publication at SIAM SICON 2023.
3.Investigating Sparse Reconfigurable Intelligent Surfaces (SRIS) via Maximum Power Transfer Efficiency Method Based on Convex Relaxation
Authors:Hans-Dieter Lang, Michel A. Nyffenegger, Heinz Mathis, Xingqi Zhang
Abstract: Reconfigurable intelligent surfaces (RISs) are widely considered to become an integral part of future wireless communication systems. Various methodologies exist to design such surfaces; however, most consider or require a very large number of tunable components. This not only raises system complexity, but also significantly increases power consumption. Sparse RISs (SRISs) consider using a smaller or even minimal number of tunable components to improve overall efficiency while maintaining sufficient RIS capability. The versatile semidefinite relaxation-based optimization method previously applied to transmit array antennas is adapted and applied accordingly, to evaluate the potential of different SRIS configurations. Because the relaxation is tight in all cases, the maximum possible performance is found reliably. Hence, with this approach, the trade-off between performance and sparseness of SRIS can be analyzed. Preliminary results show that even a much smaller number of reconfigurable elements, e.g. only 50%, can still have a significant impact.
4.On solving a rank regularized minimization problem via equivalent factorized column-sparse regularized models
Authors:Wenjing Li, Wei Bian, Kim-Chuan Toh
Abstract: The rank regularized minimization problem is an ideal model for the low-rank matrix completion/recovery problem. The matrix factorization approach can transform the high-dimensional rank regularized problem into a low-dimensional factorized column-sparse regularized problem. The latter can greatly facilitate fast computations in applicable algorithms, but needs to overcome the simultaneous non-convexity of the loss and regularization functions. In this paper, we consider the factorized column-sparse regularized model. Firstly, we optimize this model with bound constraints, and establish a certain equivalence between the optimized factorization problem and the rank regularized problem. Further, we strengthen the optimality condition for stationary points of the factorization problem and define the notion of strong stationary point. Moreover, we establish the equivalence between the factorization problem and its nonconvex relaxation in the sense of global minimizers and strong stationary points. To solve the factorization problem, we design two types of algorithms and give an adaptive method to reduce their computation. The first algorithm is from the relaxation point of view and its iterates possess some properties of global minimizers of the factorization problem after finitely many iterations. We give some analysis on the convergence of its iterates to a strong stationary point. The second algorithm is designed for directly solving the factorization problem. We improve the PALM algorithm introduced by Bolte et al. (Math Program Ser A 146:459-494, 2014) for the factorization problem and give its improved convergence results. Finally, we conduct numerical experiments to show the promising performance of the proposed model and algorithms for low-rank matrix completion.
5.An Efficient Framework for Global Non-Convex Polynomial Optimization over the Hypercube
Authors:Pierre-David Letourneau, Dalton Jones, Matthew Morse, M. Harper Langston
Abstract: We present a novel efficient theoretical and numerical framework for solving global non-convex polynomial optimization problems. We analytically demonstrate that such problems can be efficiently reformulated using a non-linear objective over a convex set; further, these reformulated problems possess no spurious local minima (i.e., every local minimum is a global minimum). We introduce an algorithm for solving these resulting problems using the augmented Lagrangian and the method of Burer and Monteiro. We show through numerical experiments that polynomial scaling in dimension and degree is achievable for computing the optimal value and location of previously intractable global polynomial optimization problems in high dimension.
6.Moreau Envelope ADMM for Decentralized Weakly Convex Optimization
Authors:Reza Mirzaeifard, Naveen K. D. Venkategowda, Alexander Jung, Stefan Werner
Abstract: This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization. Although the current versions of ADMM algorithm provide promising numerical results in producing solutions that are close to optimal for many convex and non-convex optimization problems, it remains unclear if they can converge to a stationary point for weakly convex and locally non-smooth functions. Through our analysis using the Moreau envelope function, we demonstrate that MADM can indeed converge to a stationary point under mild conditions. Our analysis also includes computing the bounds on the amount of change in the dual variable update step by relating the gradient of the Moreau envelope function to the proximal function. Furthermore, the results of our numerical experiments indicate that our method is faster and more robust than widely-used approaches.
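The relation between the gradient of the Moreau envelope and the proximal operator mentioned in this abstract can be illustrated on a textbook case. The following is a minimal sketch, assuming the convex function f(u) = |u| (chosen only because its proximal operator has a closed form; it is not the weakly convex setting of the paper). The function names are hypothetical.

```python
def prox_abs(x: float, lam: float) -> float:
    """Proximal point of f(u) = |u| at x with parameter lam > 0 (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_env_abs(x: float, lam: float) -> float:
    """Moreau envelope: min over u of |u| + (u - x)^2 / (2 * lam), attained at the prox."""
    u = prox_abs(x, lam)
    return abs(u) + (u - x) ** 2 / (2.0 * lam)


def moreau_grad_abs(x: float, lam: float) -> float:
    """Gradient of the envelope written through the prox: (x - prox(x)) / lam."""
    return (x - prox_abs(x, lam)) / lam
```

For this choice of f the envelope is the Huber function: quadratic for |x| <= lam and linear outside, so the envelope is smooth even though f itself is not.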
7.A Divide and Conquer Approximation Algorithm for Partitioning Rectangles
Authors:Reyhaneh Mohammadi, Mehdi Behroozi
Abstract: Given a rectangle $R$ with area $A$ and a set of areas $L=\{A_1,...,A_n\}$ with $\sum_{i=1}^n A_i = A$, we consider the problem of partitioning $R$ into $n$ sub-regions $R_1,...,R_n$ with areas $A_1,...,A_n$ in a way that the total perimeter of all sub-regions is minimized. The goal is to create square-like sub-regions, which are often more desired. We propose a divide and conquer algorithm for this problem that finds factor $1.2$--approximate solutions in $\mathcal{O}(n\log n)$ time.
1.Variational Analysis of Kurdyka-Lojasiewicz Property by Way of Outer Limiting Subgradients
Authors:Minghua Li, Kaiwen Meng, Xiaoqi Yang
Abstract: In this paper, for a function $f$ locally lower semicontinuous at a stationary point $\bar{x}$, we obtain complete characterizations of the Kurdyka-{\L}ojasiewicz (for short, K{\L}) property and the exact estimate of the K{\L} modulus via the outer limiting subdifferential of an auxiliary function, and obtain a sufficient condition for verifying sharpness of the K{\L} exponent. By introducing a $\frac{1}{1-\theta}$-th subderivative $h$ for $f$ at $\bar{x}$, we show that the K{\L} property of $f$ at $\bar{x}$ with exponent $\theta\in [0, 1)$ can be inherited by $h$ at $0$ with the same exponent $\theta$, and that the K{\L} modulus of $f$ at $\bar{x}$ is bounded above by that of $(1-\theta)h$ at $0$. When $\theta=\frac12$, we obtain the reverse results under the strong metric subregularity of the subgradient mapping for the class of prox-regular, twice epi-differentiable and subdifferentially continuous functions by virtue of Moreau envelopes. We apply the obtained results to establish the K{\L} property with exponent $\frac12$ and to provide calculations of the K{\L} modulus for smooth functions, the pointwise max of finitely many smooth functions and the $\ell_p$ ($0<p\leq 1$) regularized functions respectively. It is worth noting that these functions often appear in structured optimization problems.
2.A Note on Linear Quadratic Regulator and Kalman Filter
Authors:Midhun T. Augustine
Abstract: Two central problems in modern control theory are the controller design problem, which deals with designing a control law for the dynamical system, and the state estimation problem (observer design problem), which deals with computing an estimate of the states of the dynamical system. The Linear Quadratic Regulator (LQR) and Kalman Filter (KF) solve these problems, respectively, for linear dynamical systems in an optimal manner, i.e., LQR is an optimal state feedback controller and KF is an optimal state estimator. In this note, we discuss the basic concepts, derivation, steady-state analysis, and numerical implementation of the LQR and KF.
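The steady-state LQR computation can be sketched for the simplest possible case. The following is a minimal illustration, assuming a scalar discrete-time system x_{k+1} = a x_k + b u_k with stage cost q x_k^2 + r u_k^2; the function name and the numerical values are hypothetical and not taken from the note itself.

```python
def scalar_dlqr(a: float, b: float, q: float, r: float, iters: int = 200):
    """Iterate the scalar discrete-time Riccati recursion to its fixed point.

    Returns (p, k): the steady-state cost-to-go coefficient and the
    state-feedback gain of the control law u = -k * x.
    """
    p = q
    for _ in range(iters):
        # Riccati recursion: p <- q + a^2 p - (a b p)^2 / (r + b^2 p)
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


# Example data (assumed): a = b = q = r = 1.
p, k = scalar_dlqr(1.0, 1.0, 1.0, 1.0)
closed_loop_pole = 1.0 - 1.0 * k  # a - b * k
```

With these values the fixed point satisfies p^2 - p - 1 = 0, so p is the golden ratio and the closed-loop pole a - b k lies strictly inside the unit interval, i.e. the closed loop is stable.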
3.Design of Coherent Passive Quantum Equalizers Using Robust Control Theory
Authors:V. Ugrinovskii, M. R. James
Abstract: The paper develops a methodology for the design of coherent equalizing filters for quantum communication channels. Given a linear quantum system model of a quantum communication channel, the aim is to obtain another quantum system which, when coupled with the original system, mitigates degrading effects of the environment. The main result of the paper is a systematic equalizer synthesis algorithm which relies on methods of state-space robust control design via semidefinite programming.
4.Riemannian Optimistic Algorithms
Authors:Xi Wang, Deming Yuan, Yiguang Hong, Zihao Hu, Lei Wang, Guodong Shi
Abstract: In this paper, we consider Riemannian online convex optimization with dynamic regret. First, we propose two novel algorithms, namely the Riemannian Online Optimistic Gradient Descent (R-OOGD) and the Riemannian Adaptive Online Optimistic Gradient Descent (R-AOOGD), which combine the advantages of classical optimistic algorithms with the rich geometric properties of Riemannian manifolds. We analyze the dynamic regrets of the R-OOGD and R-AOOGD in terms of regularity of the sequence of cost functions and comparators. Next, we apply the R-OOGD to Riemannian zero-sum games, leading to the Riemannian Optimistic Gradient Descent Ascent algorithm (R-OGDA). We analyze the average iterate and best-iterate of the R-OGDA in seeking Nash equilibrium for two-player, zero-sum, g-convex-concave games. We also prove the last-iterate convergence of the R-OGDA for g-strongly convex-strongly concave problems. Our theoretical analysis shows that all proposed algorithms achieve results in regret and convergence that match their counterparts in Euclidean spaces. Finally, we conduct several experiments to verify our theoretical findings.
5.Quasioptimal alternating projections and their use in low-rank approximation of matrices and tensors
Authors:Stanislav Budzinskiy
Abstract: We study the convergence of specific inexact alternating projections for two non-convex sets in a Euclidean space. The $\sigma$-quasioptimal metric projection ($\sigma \geq 1$) of a point $x$ onto a set $A$ consists of points in $A$ the distance to which is at most $\sigma$ times larger than the minimal distance $\mathrm{dist}(x,A)$. We prove that quasioptimal alternating projections, when one or both projections are quasioptimal, converge locally and linearly under the usual regularity assumptions on the two sets and their intersection. The theory is motivated by the successful application of alternating projections to low-rank matrix and tensor approximation. We focus on two problems -- nonnegative low-rank approximation and low-rank approximation in the maximum norm -- and develop fast alternating-projection algorithms for matrices and tensor trains based on cross approximation and acceleration techniques. The numerical experiments confirm that the proposed methods are efficient and suggest that they can be used to regularise various low-rank computational routines.
6.The Bus Rapid Transit Investment Problem
Authors:Rowan Hoogervorst, Evelien van der Hurk, Philine Schiewe, Anita Schöbel, Reena Urban
Abstract: Bus Rapid Transit (BRT) systems can provide a fast and reliable service to passengers at low investment costs compared to tram, metro and train systems. Therefore, they can be of great value to attract more passengers to use public transport. This paper thus focuses on the BRT investment problem: Which segments of a single bus line should be upgraded such that the number of newly attracted passengers is maximized? Motivated by the construction of a new BRT line around Copenhagen, we consider a setting in which multiple parties are responsible for different segments of the line. As each party has a limited willingness to invest, we solve a bi-objective problem to quantify the trade-off between the number of attracted passengers and the investment budget. We model different problem variations: First, we consider two potential passenger responses to upgrades on the line. Second, to prevent scattered upgrades along the line, we consider different restrictions on the number of upgraded connected components on the line. We propose an epsilon-constraint-based algorithm to enumerate the complete set of non-dominated points and investigate the complexity of this problem. Moreover, we perform extensive numerical experiments on artificial instances and a case study based on the BRT line around Copenhagen. Our results show that we can generate the full Pareto front for real-life instances and that the resulting trade-off between investment budget and attracted passengers depends both on the origin-destination demand and on the passenger response to upgrades. Moreover, we illustrate how the generated Pareto plots can assist decision makers in selecting from a set of geographical route alternatives in our case study.
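The epsilon-constraint idea for enumerating non-dominated points can be sketched on a toy bi-objective instance. The following is a minimal illustration, assuming integer upgrade costs, additive passenger gains and brute-force enumeration in place of an optimization model; the data and the function name are hypothetical and do not reproduce the paper's BRT formulation.

```python
from itertools import combinations


def pareto_front_eps_constraint(costs, gains):
    """Enumerate non-dominated (gain, cost) points: maximize gain, minimize cost.

    Each round maximizes the gain subject to cost <= eps, breaks ties by the
    smallest cost, and then tightens eps below the cost just found.
    Assumes non-negative integer costs.
    """
    n = len(costs)
    subsets = [s for size in range(n + 1) for s in combinations(range(n), size)]
    points = [(sum(gains[i] for i in s), sum(costs[i] for i in s)) for s in subsets]

    front = []
    eps = max(cost for _, cost in points)
    while True:
        feasible = [(g, c) for g, c in points if c <= eps]
        if not feasible:
            break
        best_gain = max(g for g, _ in feasible)
        best_cost = min(c for g, c in feasible if g == best_gain)
        front.append((best_gain, best_cost))
        eps = best_cost - 1  # integer costs: the next budget must be strictly smaller
    return front


# Three hypothetical segments with upgrade costs and attracted passengers.
front = pareto_front_eps_constraint(costs=[3, 2, 4], gains=[5, 3, 4])
```

Each entry of `front` is a (passengers, budget) pair; dominated combinations, such as upgrading only the third segment, never appear because a cheaper selection attracts at least as many passengers.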
1.A Geometric Algorithm for Maximizing the Distance over an Intersection of Balls to a Given Point
Authors:Marius Costandin, Beniamin Costandin
Abstract: In this paper the authors propose a polynomial algorithm which allows the computation of the farthest point in an intersection of balls from a given point under three additional hypotheses: the farthest point is unique, the distance to it is known and its magnitude is known. As a use case the authors analyze the subset sum problem SSP(S,T) for a given $S\in \mathbb{R}^n$ and $T \in \mathbb{R}$. The proposed approach is to write the SSP as a distance maximization over an intersection of balls. It was shown that the SSP has a solution if and only if the maximum value of the distance has a predefined value. This, together with the fact that a solution is a corner of the unit hypercube, allows the authors to apply the proposed geometry results to find a solution to the SSP under the hypothesis that it is unique.
2.Frequency-domain criterion on the stabilizability for infinite-dimensional linear control systems
Authors:Karl Kunisch, Gengsheng Wang, Huaiqiang Yu
Abstract: A quantitative frequency-domain condition related to the exponential stabilizability for infinite-dimensional linear control systems is presented. It is proven that this condition is necessary and sufficient for the stabilizability of special systems, while it is a necessary condition for the stabilizability in general. Applications are provided.
3.The Agricultural Spraying Vehicle Routing Problem With Splittable Edge Demands
Authors:Qian Wan, Rodolfo García-Flores, Simon A. Bowly, Philip Kilby, Andreas T. Ernst
Abstract: In horticulture, spraying applications occur multiple times throughout any crop year. This paper presents a splittable agricultural chemical spraying vehicle routing problem and formulates it as a mixed integer linear program. The main difference from the classical capacitated arc routing problem (CARP) is that our problem allows us to split the demand on a single demand edge amongst robotic sprayers. We use theoretical insights about the optimal solution structure to improve the formulation and provide two different formulations of the splittable capacitated arc routing problem (SCARP): a basic spray formulation and a large edge demands formulation for problems with large edge demands. This study presents solution methods consisting of lazy constraints, symmetry elimination constraints, and a heuristic repair method. Computational experiments on a set of valuable data based on the properties of real-world agricultural orchard fields reveal that the proposed methods can solve the SCARP with different properties. We also report computational results on classical benchmark sets from previous CARP literature. The results indicate that the SCARP model can provide cheaper solutions in some instances when compared with the classical CARP literature. In addition, the heuristic repair method significantly improves the quality of the solution by decreasing the upper bound when solving large-scale problems.
4.Limited memory gradient methods for unconstrained optimization
Authors:Giulia Ferrandi, Michiel E. Hochstenbach
Abstract: The limited memory steepest descent method (Fletcher, 2012) for unconstrained optimization problems stores a few past gradients to compute multiple stepsizes at once. We review this method and propose new variants. For strictly convex quadratic objective functions, we study the numerical behavior of different techniques to compute new stepsizes. In particular, we introduce a method to improve the use of harmonic Ritz values. We also show the existence of a secant condition associated with LMSD, where the approximating Hessian is projected onto a low-dimensional space. In the general nonlinear case, we propose two new alternatives to Fletcher's method: first, the addition of symmetry constraints to the secant condition valid for the quadratic case; second, a perturbation of the last differences between consecutive gradients, to satisfy multiple secant equations simultaneously. We show that Fletcher's method can also be interpreted from this viewpoint.
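The idea of deriving stepsizes from stored gradient information can be illustrated with a simpler relative of the method in this abstract. The sketch below uses the classical Barzilai-Borwein stepsize, which relies only on the most recent pair of iterate and gradient differences; it is not the limited memory steepest descent method itself. It assumes a strictly convex quadratic f(x) = 0.5 * sum(a_i * x_i^2) with a diagonal Hessian, and the function name, the initial stepsize and the data are hypothetical.

```python
def bb_gradient(hess_diag, x0, alpha0=0.05, iters=100):
    """Gradient method with the Barzilai-Borwein stepsize alpha = (s.s) / (s.y).

    hess_diag holds the positive diagonal entries a_i of the Hessian, so the
    gradient of f(x) = 0.5 * sum(a_i * x_i^2) is simply a_i * x_i.
    """
    x = list(x0)
    g = [a * xi for a, xi in zip(hess_diag, x)]
    alpha = alpha0
    for _ in range(iters):
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = [a * xi for a, xi in zip(hess_diag, x_new)]
        s = [xn - xo for xn, xo in zip(x_new, x)]   # iterate difference
        y = [gn - go for gn, go in zip(g_new, g)]   # gradient difference
        x, g = x_new, g_new
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy <= 0.0:  # only happens here once the iterates have stopped moving
            break
        # Secant-based stepsize: inverse Rayleigh quotient of the Hessian along s.
        alpha = sum(si * si for si in s) / sy
    return x


x_final = bb_gradient([1.0, 10.0], [1.0, 1.0])
```

The stepsize is the reciprocal of a Rayleigh quotient of the Hessian, which is the one-dimensional analogue of using several Ritz values when more gradients are kept in memory.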
5.Uniform Turnpike Property and Singular Limits
Authors:Martin Hernandez, Enrique Zuazua
Abstract: Motivated by singular limits for long-time optimal control problems, we investigate a class of parameter-dependent parabolic equations. First, we prove a turnpike result, uniform with respect to the parameters within a suitable regularity class and under appropriate bounds. The main ingredient of our proof is the justification of the uniform exponential stabilization of the corresponding Riccati equations, which is derived from the uniform null control properties of the model. Then, we focus on a heat equation with rapidly oscillating coefficients. In the one-dimensional setting, we obtain a uniform turnpike property with respect to the highly oscillatory heterogeneous medium. Afterward, we establish the homogenization of the turnpike property. Finally, our results are validated by numerical experiments.
6.Energy Space Newton Differentiability for Solution Maps of Unilateral and Bilateral Obstacle Problems
Authors:Constantin Christof, Gerd Wachsmuth
Abstract: We prove that the solution operator of the classical unilateral obstacle problem on a nonempty open bounded set $\Omega \subset \mathbb{R}^d$, $d \in \mathbb{N}$, is Newton differentiable as a function from $L^p(\Omega)$ to $H_0^1(\Omega)$ whenever $\max(1, 2d/(d+2)) < p \leq \infty$. By exploiting this Newton differentiability property, results on angled subspaces in $H^{-1}(\Omega)$, and a formula for orthogonal projections onto direct sums, we further show that the solution map of the classical bilateral obstacle problem is Newton differentiable as a function from $L^p(\Omega)$ to $H_0^1(\Omega)\cap L^q(\Omega)$ whenever $\max(1, d/2) < p \leq \infty$ and $1 \leq q <\infty$. For both the unilateral and the bilateral case, we provide explicit formulas for the Newton derivative. As a concrete application example for our results, we consider the numerical solution of an optimal control problem with $H_0^1(\Omega)$-controls and box-constraints by means of a semismooth Newton method.
7.Second-order methods for quartically-regularised cubic polynomials, with applications to high-order tensor methods
Authors:Coralia Cartis, Wenqi Zhu
Abstract: There has been growing interest in high-order tensor methods for nonconvex optimization, with adaptive regularization, as they possess better/optimal worst-case evaluation complexity globally and faster convergence asymptotically. These algorithms crucially rely on repeatedly minimizing nonconvex multivariate Taylor-based polynomial sub-problems, at least locally. Finding efficient techniques for the solution of these sub-problems, beyond the second-order case, has been an open question. This paper proposes a second-order method, Quadratic Quartic Regularisation (QQR), for efficiently minimizing nonconvex quartically-regularized cubic polynomials, such as the AR$p$ sub-problem [3] with $p=3$. Inspired by [35], QQR approximates the third-order tensor term by a linear combination of quadratic and quartic terms, yielding (possibly nonconvex) local models that are solvable to global optimality. In order to achieve accuracy $\epsilon$ in the first-order criticality of the sub-problem, we show that the error in the QQR method decreases either linearly or by at least $\mathcal{O}(\epsilon^{4/3})$ for locally convex iterations, while in the sufficiently nonconvex case, by at least $\mathcal{O}(\epsilon)$; thus improving, on these types of iterations, the general cubic-regularization bound. Preliminary numerical experiments indicate that two QQR variants perform competitively with state-of-the-art approaches such as ARC (also known as AR$p$ with $p=2$), achieving either a lower objective value or iteration counts.
8.Gauss-Newton oriented greedy algorithms for the reconstruction of operators in nonlinear dynamics
Authors:S. Buchwald, G. Ciaramella, J. Salomon
Abstract: This paper is devoted to the development and convergence analysis of greedy reconstruction algorithms based on the strategy presented in [Y. Maday and J. Salomon, Joint Proceedings of the 48th IEEE Conference on Decision and Control and the 28th Chinese Control Conference, 2009, pp. 375--379]. These procedures allow the design of a sequence of control functions that ease the identification of unknown operators in nonlinear dynamical systems. The original strategy of greedy reconstruction algorithms is based on an offline/online decomposition of the reconstruction process and an ansatz for the unknown operator obtained by an a priori chosen set of linearly independent matrices. In the previous work [S. Buchwald, G. Ciaramella and J. Salomon, SIAM J. Control Optim., 59(6), pp. 4511-4537], convergence results were obtained in the case of linear identification problems. We tackle here the more general case of nonlinear systems. More precisely, we introduce a new greedy algorithm based on the linearized system. Then, we show that the controls obtained with this new algorithm lead to the local convergence of the classical Gauss-Newton method applied to the online nonlinear identification problem. We then extend this result to the controls obtained on nonlinear systems where a local convergence result is also proved. The main convergence results are obtained for the reconstruction of drift operators in dynamical systems with linear and bilinear control structures.
1.The Nesterov-Spokoiny Acceleration: $o(1/k^2)$ Convergence without Proximal Operations
Authors:Weibin Peng, Tianyu Wang
Abstract: This paper studies a variant of an accelerated gradient algorithm of Nesterov and Spokoiny. We call this algorithm the Nesterov-Spokoiny Acceleration (NSA). The NSA algorithm satisfies the following properties for smooth convex programs, 1. The sequence $\{ \mathbf{x}_k \}_{k \in \mathbb{N}} $ governed by the NSA satisfies $ \limsup\limits_{k \to \infty } k^2 ( f (\mathbf{x}_k ) - f^* ) = 0 $, where $f^* > -\infty$ is the minimum of the smooth convex function $f$. 2. The sequence $\{ \mathbf{x}_k \}_{k \in \mathbb{N}} $ governed by the NSA satisfies $ \liminf\limits_{k \to \infty } k^2 \log k \log\log k ( f (\mathbf{x}_k ) - f^* ) = 0 $. 3. The sequence $\{ \mathbf{y}_k \}_{k \in \mathbb{N}} $ governed by NSA satisfies $ \liminf\limits_{k \to \infty } k^3 \log k \log\log k \| \nabla f ( \mathbf{y}_k ) \|^2 = 0 $. Item 1 above is perhaps more important than items 2 and 3: For general smooth convex programs, NSA is the first gradient algorithm that achieves $o(k^{-2})$ convergence rate without proximal operations. Some extensions of the NSA algorithm are also studied. Also, our study on a zeroth-order variant of NSA shows that $o(1/k^2)$ convergence can be achieved via estimated gradient.
2.General Discrete-Time Fokker-Planck Control by Power Moments
Authors:Guangyu Wu, Anders Lindquist
Abstract: In this paper, we address the so-called general Fokker-Planck control problem for discrete-time first-order linear systems. Unlike conventional treatments, we don't assume the distributions of the system states to be Gaussian. Instead, we only assume the existence and finiteness of the first several order power moments of the distributions. It is proved in the literature that there doesn't exist a solution, which has a form of conventional feedback control, to this problem. We propose a moment representation of the system to turn the original problem into a finite-dimensional one. Then a novel feedback control term, which is a mixture of a feedback term and a Markovian transition kernel term is proposed to serve as the control input of the moment system. The states of the moment system are obtained by maximizing the smoothness of the state transition. The power moments of the transition kernels are obtained by a convex optimization problem, of which the solution is proved to exist and be unique. Then they are mapped back to the probability distributions. The control inputs to the original system are then obtained by sampling from the realized distributions. Simulation results are provided to validate our algorithm in treating the general discrete-time Fokker-Planck control problem.
3.Calculation of Dispatchable Region for Renewables with Advanced Computational Techniques
Authors:Bin Liu, Thomas Brinsmead, Stefan Westerlund, Robert Davy
Abstract: Dispatchable region for renewables (DRR) depicts a space for renewables that a power system operator can manage by dispatching controllable resources. The DRR can be used to evaluate the distance from an operating point to a secure boundary and identify ramping events with the highest risk. However, existing approaches based on MILP reformulation or iteration-based LP algorithms may be computationally challenging. This paper investigates if advanced computation techniques, including high-performance computing and parallel computing techniques, can improve the computational performance.
4.On the identification of ARMA graphical models
Authors:Mattia Zorzi
Abstract: The paper considers the problem of estimating a graphical model corresponding to an autoregressive moving-average (ARMA) Gaussian stochastic process. We propose a new maximum entropy covariance and cepstral extension problem and we show that the problem admits an approximate solution which represents an ARMA graphical model whose topology is determined by the selected entries of the covariance lags considered in the extension problem. Then, we show how the corresponding dual problem is connected with the maximum likelihood principle. This connection allows us to design a Bayesian model and characterize an approximate maximum a posteriori estimator of the ARMA graphical model in the case where the graph topology is unknown. We test the performance of the proposed method through some numerical experiments.
5.An iterative conditional dispatch algorithm for the dynamic dispatch waves problem
Authors:Leon Lan, Jasper van Doorn, Niels A. Wouda, Arpan Rijal, Sandjai Bhulai
Abstract: A challenge in same-day delivery operations is that delivery requests are typically not known beforehand, but are instead revealed dynamically during the day. This uncertainty introduces a trade-off between dispatching vehicles to serve requests as soon as they are revealed to ensure timely delivery, and delaying the dispatching decision to consolidate routing decisions with future, currently unknown requests. In this paper we study the dynamic dispatch waves problem, a same-day delivery problem in which vehicles are dispatched at fixed decision moments. At each decision moment, the system operator must decide which of the known requests to dispatch, and how to route these dispatched requests. The operator's goal is to minimize the total routing cost while ensuring all requests are served on time. We propose iterative conditional dispatch (ICD), an iterative solution construction procedure based on a sample scenario approach. ICD iteratively solves sample scenarios to classify requests to be dispatched, postponed, or undecided. The set of undecided requests shrinks in each iteration until a final dispatching decision is made in the last iteration. We develop two variants of ICD: one variant based on thresholds, and another variant based on similarity. A significant strength of ICD is that it is conceptually simple and easy to implement. This simplicity does not harm performance: through rigorous numerical experiments, we show that both variants efficiently navigate the large state and action spaces of the dynamic dispatch waves problem and quickly converge to a high-quality solution. In particular, the threshold-based ICD variant improves over a greedy myopic strategy by 27.2% on average, and outperforms methods from the literature by 0.8% on average, and up to 1.5% in several cases.
6.On the interplay between pricing, competition and QoS in ride-hailing
Authors:Tushar Shankar Walunj, Shiksha Singhal, Jayakrishnan Nair, Veeraruna Kavitha
Abstract: We analyse a non-cooperative game between two competing ride-hailing platforms, each of which is modeled as a two-sided queueing system, where drivers (with a limited level of patience) are assumed to arrive according to a Poisson process at a fixed rate, while the arrival process of (price-sensitive) passengers is split across the two platforms based on Quality of Service (QoS) considerations. As a benchmark, we also consider a monopolistic scenario, where each platform gets half the market share irrespective of its pricing strategy. The key novelty of our formulation is that the total market share is fixed across the platforms. The game thus captures the competition between the platforms over market share, with pricing being the lever used by each platform to influence its share of the market. The market share split is modeled via two different QoS metrics: (i) probability that an arriving passenger gets a ride (driver availability), and (ii) probability that an arriving passenger gets an acceptable ride (driver availability and acceptable price). The platform aims to maximize the rate of revenue generated from matching drivers and passengers. In each of the above settings, we analyse the equilibria associated with the game in a certain limiting regime, where driver patience is scaled to infinity. We also show that these equilibria remain relevant in the more practically meaningful `pre-limit,' where drivers are highly (but not infinitely) patient. Interestingly, under the second QoS metric, we show that for a certain range of system parameters, no pure Nash equilibrium exists. Instead, we demonstrate a novel solution concept called an \textit{equilibrium cycle}, which has interesting dynamic connotations. Our results highlight the interplay between competition, passenger-side price sensitivity, and passenger/driver arrival rates.
7.Stochastic optimal control problems with delays in the state and in the control via viscosity solutions and an economical application
Authors:Filippo de Feo
Abstract: In this manuscript we consider optimal control problems of deterministic and stochastic differential equations with delays in the state and in the control. First we prove an equivalent Markovian reformulation on Hilbert spaces of the state equation. Then, using the dynamic programming approach for infinite-dimensional systems, we prove that the value function is the unique viscosity solution of the infinite-dimensional Hamilton-Jacobi-Bellman equation. Finally we apply this result to a stochastic optimal advertising problem with delays in the state and in the control.
8.Strict Dissipativity and turnpike for LQ Optimal Control Problems with Possibly Boundary Reference
Authors:Zhuqing Li, Roberto Guglielmi
Abstract: In this paper we investigate the turnpike property for constrained LQ optimal control problems in connection with dissipativity of the control system. We determine sufficient conditions to ensure the turnpike property in the case of a turnpike reference possibly occurring on the boundary of the state constraint set.
9.A real moment-HSOS hierarchy for complex polynomial optimization with real coefficients
Authors:Jie Wang, Victor Magron
Abstract: This paper proposes a real moment-HSOS hierarchy for complex polynomial optimization problems with real coefficients. We show that this hierarchy provides the same sequence of lower bounds as the complex analogue, yet is much cheaper to solve. In addition, we prove that global optimality is achieved when the ranks of the moment matrix and a certain submatrix equal two in the case that a sphere constraint is present, and as a consequence, the complex polynomial optimization problem has either two real optimal solutions or a pair of conjugate optimal solutions. A simple procedure for extracting a pair of conjugate optimal solutions is given in the latter case. Various numerical examples are presented to demonstrate the efficiency of this new hierarchy, and an application to polyphase code design is also provided.
10.Minimizing Quasi-Self-Concordant Functions by Gradient Regularization of Newton Method
Authors:Nikita Doikov
Abstract: We study the composite convex optimization problems with a Quasi-Self-Concordant smooth component. This problem class naturally interpolates between classic Self-Concordant functions and functions with Lipschitz continuous Hessian. Previously, the best complexity bounds for this problem class were associated with trust-region schemes and implementations of a ball-minimization oracle. In this paper, we show that for minimizing Quasi-Self-Concordant functions we can use instead the basic Newton Method with Gradient Regularization. For unconstrained minimization, it only involves a simple matrix inversion operation (solving a linear system) at each step. We prove a fast global linear rate for this algorithm, matching the complexity bound of the trust-region scheme, while our method remains especially simple to implement. Then, we introduce the Dual Newton Method, and based on it, develop the corresponding Accelerated Newton Scheme for this problem class, which further improves the complexity factor of the basic method. As a direct consequence of our results, we establish fast global linear rates of simple variants of the Newton Method applied to several practical problems, including Logistic Regression, Soft Maximum, and Matrix Scaling, without requiring additional assumptions on strong or uniform convexity for the target objective.
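A Newton step with gradient regularization can be sketched in one dimension. The following is a minimal illustration, assuming the update x <- x - f'(x) / (f''(x) + M * |f'(x)|), i.e. a regularization weight proportional to the gradient norm; this specific form, the constant M = 1 and the logistic-type test function f(x) = log(1 + e^x) - 0.3 x are assumptions for illustration and are not stated in the abstract.

```python
import math


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def grad_reg_newton_1d(grad, hess, x0: float, M: float = 1.0, iters: int = 50) -> float:
    """Newton iteration whose Hessian is shifted by M times the gradient magnitude."""
    x = x0
    for _ in range(iters):
        g = grad(x)
        if g == 0.0:
            break
        # In n dimensions this division becomes one linear solve with
        # the matrix hess(x) + M * ||grad(x)|| * I.
        x -= g / (hess(x) + M * abs(g))
    return x


# f(x) = log(1 + exp(x)) - 0.3 * x, minimized where sigmoid(x) = 0.3.
grad = lambda x: sigmoid(x) - 0.3
hess = lambda x: sigmoid(x) * (1.0 - sigmoid(x))
x_star = grad_reg_newton_1d(grad, hess, x0=5.0)
```

Far from the minimizer the second derivative is tiny and the regularization term keeps the step bounded; close to the minimizer the gradient vanishes and the step approaches a pure Newton step.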
11.Matheuristic for Vehicle Routing Problem with Multiple Synchronization Constraints and Variable Service Time
Authors:Faisal Alkaabneh, Rabiatu Bonku
Abstract: This paper considers an extension of the vehicle routing problem with synchronization constraints and introduces the vehicle routing problem with multiple synchronization constraints and variable service time. This important problem is motivated by a real-world problem faced by one of the largest agricultural companies in the world providing precision agriculture services to their clients who are farmers and growers. The solution to this problem impacts the performance of farm spraying operations and can help design policies to improve spraying operations in large-scale farming. We propose a Mixed Integer Programming (MIP) model for this challenging problem, along with problem-specific valid inequalities. A three-phase powerful matheuristic is proposed to solve large instances enhanced with a novel local search method. We conduct extensive numerical analysis using realistic data. Results show that our matheuristic is fast and efficient in terms of solution quality and computational time compared to the state-of-the-art MIP solver. Using real-world data, we demonstrate the importance of considering an optimization approach to solve the problem, showing that the policy implemented in practice overestimates the costs by 15-20%. Finally, we compare and contrast the impact of various decision-maker preferences on several key performance metrics by comparing different mathematical models.
1.Optimal Planning in Habit Formation Models with Multiple Goods
Authors:Mauro Bambi, Daria Ghilli, Fausto Gozzi, Marta Leocata
Abstract: In this paper, along the lines of e.g. [COW00], we investigate a model with habit formation and two types of substitute goods. Such a family of models, even in the case of one good, is difficult to study since the utility function is not concave in the interesting cases (see e.g. [BG20]), hence the first-order conditions are not sufficient. We introduce and explain the model and provide some first results using the dynamic programming approach. These results will form a solid ground on which a deeper study of the features of the solutions can be performed.
2.A Fast Minimization Algorithm for the Euler Elastica Model Based on a Bilinear Decomposition
Authors:Zhifang Liu, Baochen Sun, Xue-Cheng Tai, Qi Wang, Huibin Chang
Abstract: The Euler Elastica (EE) model with surface curvature can generate artifact-free results compared with the traditional total variation regularization model in image processing. However, strong nonlinearity and singularity due to the curvature term in the EE model pose a great challenge for one to design fast and stable algorithms for the EE model. In this paper, we propose a new, fast, hybrid alternating minimization (HALM) algorithm for the EE model based on a bilinear decomposition of the gradient of the underlying image and prove the global convergence of the minimizing sequence generated by the algorithm under mild conditions. The HALM algorithm comprises three sub-minimization problems and each is either solved in the closed form or approximated by fast solvers making the new algorithm highly accurate and efficient. We also discuss the extension of the HALM strategy to deal with general curvature-based variational models, especially with a Lipschitz smooth functional of the curvature. A host of numerical experiments are conducted to show that the new algorithm produces good results with much-improved efficiency compared to other state-of-the-art algorithms for the EE model. As one of the benchmarks, we show that the average running time of the HALM algorithm is at most one-quarter of that of the fast operator-splitting-based Deng-Glowinski-Tai algorithm.
1.Convex envelopes of bounded monomials on two-variable cones
Authors:Pietro Belotti
Abstract: We consider an $n$-variate monomial function that is restricted both in value by lower and upper bounds and in domain by two homogeneous linear inequalities. Such functions are building blocks of several problems that arise in practical applications and fall under the class of Mixed Integer Nonlinear Optimization. We show that the upper envelope of the function in the given domain, for $n\ge 2$, is given by a conic inequality. We also present the lower envelope for $n=2$. To assess the applicability of branching rules based on homogeneous linear inequalities, we also derive the volume of the convex hull for $n=2$.
2.A Distributed Linear Quadratic Discrete-Time Game Approach to Formation Control with Collision Avoidance
Authors:Prima Aditya, Herbert Werner
Abstract: Formation control problems can be expressed as linear quadratic discrete-time games (LQDTG) for which Nash equilibrium solutions are sought. However, solving such problems requires solving coupled Riccati equations, which cannot be done in a distributed manner. A recent study showed that a distributed implementation is possible for a consensus problem when fictitious agents are associated with edges in the network graph rather than nodes. This paper proposes an extension of this approach to formation control with collision avoidance, where collision is precluded by including appropriate penalty terms on the edges. To address the problem, a state-dependent Riccati equation needs to be solved since the collision avoidance term in the cost function leads to a state-dependent weight matrix. This solution provides relative control inputs associated with the edges of the network graph. These relative inputs then need to be mapped to the physical control inputs applied at the nodes; this can be done in a distributed manner by iterating over a gradient descent search between neighbors in each sampling interval. Unlike the inter-sample iteration frequently used in distributed MPC, each iteration step here requires only a matrix-vector multiplication rather than the solution of an optimization problem. This approach can be implemented in a receding horizon manner, as demonstrated through a numerical example.
1.Solving Elliptic Optimal Control Problems using Physics Informed Neural Networks
Authors:Bangti Jin, Ramesh Sau, Luowei Yin, Zhi Zhou
Abstract: In this work, we present and analyze a numerical solver for optimal control problems (without / with box constraint) for linear and semilinear second-order elliptic problems. The approach is based on a coupled system derived from the first-order optimality system of the optimal control problem, and applies physics informed neural networks (PINNs) to solve the coupled system. We present an error analysis of the numerical scheme, and provide $L^2(\Omega)$ error bounds on the state, control and adjoint state in terms of deep neural network parameters (e.g., depth, width, and parameter bounds) and the number of sampling points in the domain and on the boundary. The main tools in the analysis include offset Rademacher complexity and boundedness and Lipschitz continuity of neural network functions. We present several numerical examples to illustrate the approach and compare it with three existing approaches.
2.Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak-Łojasiewicz condition
Authors:Hyung Jun Choi, Woocheol Choi, Jinmyoung Seok
Abstract: In this work, we establish a linear convergence estimate for gradient descent with delay $\tau\in\mathbb{N}$ when the cost function is $\mu$-strongly convex and $L$-smooth. This result improves upon the well-known estimates in Arjevani et al. \cite{ASS} and Stich-Karmireddy \cite{SK} in the sense that it is non-ergodic and still holds under a weaker constraint on the cost function. Also, the range of the learning rate $\eta$ can be extended from $\eta\leq 1/(10L\tau)$ to $\eta\leq 1/(4L\tau)$ for $\tau =1$ and $\eta\leq 3/(10L\tau)$ for $\tau \geq 2$, where $L >0$ is the Lipschitz constant of the gradient of the cost function. Furthermore, we show the linear convergence of the cost function under the Polyak-Łojasiewicz (PL) condition, for which the admissible range of the learning rate is further improved to $\eta\leq 9/(10L\tau)$ for large delay $\tau$. Finally, some numerical experiments are provided to confirm the reliability of the analysis.
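A small sketch may help fix ideas about delayed gradient descent. It assumes the common form of the iteration, $x_{t+1} = x_t - \eta \nabla f(x_{t-\tau})$, with $x_s = x_0$ for $s < 0$; the quadratic test function, the constants `mu`, `L`, `tau`, and the helper name are illustrative assumptions rather than the authors' setup. The step size is set to the bound $\eta = 3/(10L\tau)$ quoted in the abstract for $\tau \geq 2$.

```python
from collections import deque

def delayed_gradient_descent(grad, x0, eta, tau, steps):
    """Iterate x_{t+1} = x_t - eta * grad(x_{t - tau}), using x_0 for negative indices."""
    # history holds the last tau + 1 iterates; history[0] is the stale one.
    history = deque([list(x0)] * (tau + 1), maxlen=tau + 1)
    x = list(x0)
    for _ in range(steps):
        g = grad(history[0])                       # gradient at the delayed iterate
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        history.append(x)                          # oldest iterate is dropped
    return x

# Illustrative problem: f(x) = 0.5 * (mu * x1^2 + L * x2^2), minimized at the origin.
mu, L, tau = 1.0, 10.0, 3
quad_grad = lambda x: [mu * x[0], L * x[1]]
eta = 3.0 / (10.0 * L * tau)
x_last = delayed_gradient_descent(quad_grad, [1.0, 1.0], eta, tau, steps=3000)
```

Because the estimate is non-ergodic, it concerns the last iterate `x_last` itself rather than an average of iterates.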
3.An Accelerated Block Proximal Framework with Adaptive Momentum for Nonconvex and Nonsmooth Optimization
Authors:Weifeng Yang, Wenwen Min
Abstract: We propose an accelerated block proximal linear framework with adaptive momentum (ABPL$^+$) for nonconvex and nonsmooth optimization. We analyze the potential causes of the extrapolation step failing in some algorithms, and resolve this issue by enhancing the comparison process that evaluates the trade-off between the proximal gradient step and the linear extrapolation step in our algorithm. Furthermore, we extend our algorithm to any scenario involving updating block variables with positive integers, allowing each cycle to randomly shuffle the update order of the variable blocks. Additionally, under mild assumptions, we prove that ABPL$^+$ can monotonically decrease the function value without strictly restricting the extrapolation parameters and step size, demonstrate the viability and effectiveness of updating these blocks in a random order, and show more directly and intuitively that the derivative set of the sequence generated by our algorithm is a critical point set. Moreover, we demonstrate the global convergence as well as the linear and sublinear convergence rates of our algorithm by utilizing the Kurdyka-Łojasiewicz (KŁ) condition. To enhance the effectiveness and flexibility of our algorithm, we also extend the study to the inexact version of our algorithm and construct an adaptive extrapolation parameter strategy, which improves its overall performance. We apply our algorithm to multiple non-negative matrix factorization with the $\ell_0$ norm and nonnegative tensor decomposition with the $\ell_0$ norm, and perform extensive numerical experiments to validate its effectiveness and efficiency.
4.Data-driven decision-focused surrogate modeling
Authors:Rishabh Gupta, Qi Zhang
Abstract: We introduce the concept of decision-focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real-time settings. The proposed data-driven framework seeks to learn a simpler, e.g. convex, surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data-driven inverse optimization problem to which we apply a decomposition-based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision-focused surrogate modeling with standard data-driven surrogate modeling methods and demonstrate that our approach is significantly more data-efficient while producing simple surrogate models with high decision prediction accuracy.
5.Funnel MPC for nonlinear systems with arbitrary relative degree
Authors:Thomas Berger, Dario Dennstädt
Abstract: The Model Predictive Control (MPC) scheme Funnel MPC enables output tracking of smooth reference signals with prescribed error bounds for nonlinear multi-input multi-output systems with stable internal dynamics. Earlier works achieved the control objective for systems with relative degree restricted to one or incorporated additional feasibility constraints in the optimal control problem. Here we resolve these limitations by introducing a modified stage cost function relying on a weighted sum of the tracking error derivatives. The weights need to be sufficiently large and we state explicit lower bounds. Under these assumptions we are able to prove initial and recursive feasibility of the novel Funnel MPC scheme for systems with arbitrary relative degree, without requiring any terminal conditions, a sufficiently long prediction horizon or additional output constraints.
1.Distorted optimal transport
Authors:Haiyan Liu, Bin Wang, Ruodu Wang, Sheng Chao Zhuang
Abstract: Classic optimal transport theory is built on minimizing the expected cost between two given distributions. We propose the framework of distorted optimal transport by minimizing a distorted expected cost. This new formulation is motivated by concrete problems in decision theory, robust optimization, and risk management, and it has many distinct features compared to the classic theory. We choose simple cost functions and study different distortion functions and their implications on the optimal transport plan. We show that on the real line, the comonotonic coupling is optimal for the distorted optimal transport problem when the distortion function is convex and the cost function is submodular and monotone. Some forms of duality and uniqueness results are provided. For inverse-S-shaped distortion functions and linear cost, we obtain the unique form of optimal coupling for all marginal distributions, which turns out to have an interesting "first comonotonic, then counter-monotonic" dependence structure; for S-shaped distortion functions a similar structure is obtained. Our results highlight several challenges and features in distorted optimal transport, offering a new mathematical bridge between the fields of probability, decision theory, and risk management.
2.A Tight Formulation for the Dial-a-Ride Problem
Authors:Daniela Gaul, Kathrin Klamroth, Christian Pfeiffer, Arne Schulz, Michael Stiglmayr
Abstract: Ridepooling services play an increasingly important role in modern transportation systems. With soaring demand and growing fleet sizes, the underlying route planning problems become increasingly challenging. In this context, we consider the dial-a-ride problem (DARP): Given a set of transportation requests with pick-up and delivery locations, passenger numbers, time windows, and maximum ride times, an optimal routing for a fleet of vehicles, including an optimized passenger assignment, needs to be determined. We present tight mixed-integer linear programming (MILP) formulations for the DARP by combining two state-of-the-art models into novel location-augmented-event-based formulations. Strong valid inequalities and lower and upper bounding techniques are derived to further improve the formulations. We then demonstrate the theoretical and computational superiority of the new model: First, the formulation is tight in the sense that, if time windows shrink to a single point in time, the linear programming relaxation yields integer (and hence optimal) solutions. Second, extensive numerical experiments on benchmark instances show that computational times are on average reduced by 49.7% compared to state-of-the-art event-based approaches.
3.Reproducing kernel approach to linear quadratic mean field control problems
Authors:Pierre-Cyril Aubin-Frankowski, Alain Bensoussan
Abstract: Mean-field control problems have received continuous interest over the last decade. Despite being more intricate than in classical optimal control, the linear-quadratic setting can still be tackled through Riccati equations. Remarkably, we demonstrate that another significant attribute extends to the mean-field case: the existence of an intrinsic reproducing kernel Hilbert space associated with the problem. Our findings reveal that this Hilbert space not only encompasses deterministic controlled push-forward mappings but can also represent stochastic dynamics. Specifically, incorporating Brownian noise affects the deterministic kernel through a conditional expectation, which makes the trajectories adapted. Introducing reproducing kernels allows us to rewrite the mean-field control problem as optimizing over a Hilbert space of trajectories rather than controls. This framework even accommodates nonlinear terminal costs, without resorting to adjoint processes or Pontryagin's maximum principle, further highlighting the versatility of the proposed methodology.
4.Iterative risk-constrained model predictive control: A data-driven distributionally robust approach
Authors:Alireza Zolanvari, Ashish Cherukuri
Abstract: This paper proposes an iterative distributionally robust model predictive control (MPC) scheme to solve a risk-constrained infinite-horizon optimal control problem. In each iteration, the algorithm generates a trajectory from the starting point to the target equilibrium state with the aim of respecting risk constraints with high probability (that encodes safe operation of the system) and improving the cost of the trajectory as compared to previous iterations. At the end of each iteration, the visited states and observed samples of the uncertainty are stored and accumulated with the previous observations. For each iteration, the states stored previously are considered as terminal constraints of the MPC scheme, and samples obtained thus far are used to construct distributionally robust risk constraints. As iterations progress, more data is obtained and the environment is explored progressively to ensure better safety and cost optimality. We prove that the MPC scheme in each iteration is recursively feasible and the resulting trajectories converge asymptotically to the target while ensuring safety with high probability. We identify conditions under which the cost-to-go reduces as iterations progress. For systems with locally one-step reachable target, we specify scenarios that ensure finite-time convergence of iterations. We provide computationally tractable reformulations of the risk constraints for total variation and Wasserstein distance-based ambiguity sets. A simulation example illustrates the application of our results in finding a risk-constrained path for two mobile robots facing an uncertain obstacle.
5.Risk-Minimizing Two-Player Zero-Sum Stochastic Differential Game via Path Integral Control
Authors:Apurva Patil, Yujing Zhou, David Fridovich-Keil, Takashi Tanaka
Abstract: This paper addresses a continuous-time risk-minimizing two-player zero-sum stochastic differential game (SDG), in which each player aims to minimize its probability of failure. Failure occurs in the event when the state of the game enters into predefined undesirable domains, and one player's failure is the other's success. We derive a sufficient condition for this game to have a saddle-point equilibrium and show that it can be solved via a Hamilton-Jacobi-Isaacs (HJI) partial differential equation (PDE) with Dirichlet boundary condition. Under certain assumptions on the system dynamics and cost function, we establish the existence and uniqueness of the saddle-point of the game. We provide explicit expressions for the saddle-point policies which can be numerically evaluated using path integral control. This allows us to solve the game online via Monte Carlo sampling of system trajectories. We implement our control synthesis framework on two classes of risk-minimizing zero-sum SDGs: a disturbance attenuation problem and a pursuit-evasion game. Simulation studies are presented to validate the proposed control synthesis framework.
6.Decision-Making for Land Conservation: A Derivative-Free Optimization Framework with Nonlinear Inputs
Authors:Cassidy K. Buhler, Hande Y. Benson
Abstract: Protected areas (PAs) are designated spaces where human activities are restricted to preserve critical habitats. Decision-makers are challenged with balancing a trade-off of financial feasibility with ecological benefit when establishing PAs. Given the long-term ramifications of these decisions and the constantly shifting environment, it is crucial that PAs are carefully selected with long-term viability in mind. Using AI tools like simulation and optimization is common for designating PAs, but current decision models are primarily linear. In this paper, we propose a derivative-free optimization framework paired with a nonlinear component, population viability analysis (PVA). Formulated as a mixed integer nonlinear programming (MINLP) problem, our model allows for linear and nonlinear inputs. Connectivity, competition, crowding, and other similar concerns are handled by the PVA software, rather than expressed as constraints of the optimization model. In addition, we present numerical results that serve as a proof of concept, showing our models yield PAs with similar expected risk to that of preserving every parcel in a habitat, but at a significantly lower cost. The overall goal is to promote interdisciplinary work by providing a new mathematical programming tool for conservationists that allows for nonlinear inputs and can be paired with existing ecological software.
1.A relaxation method for binary orthogonal optimization problems with its applications
Authors:Lianghai Xiao, Yitian Qian, Shaohua Pan
Abstract: This paper focuses on a class of binary orthogonal optimization problems frequently arising in semantic hashing. Since this class of problems may have an empty feasible set, rendering them not well-defined, we introduce an equivalent model involving a restricted Stiefel manifold and a matrix box set, and then investigate its penalty problems induced by the $\ell_1$-distance from the box set and its Moreau envelope. The two penalty problems are always well-defined, and moreover, they serve as global exact penalties provided that the original model is well-defined. Notably, the penalty problem induced by the Moreau envelope is a smooth optimization over an embedded submanifold with a favorable structure. We develop a retraction-based nonmonotone line-search Riemannian gradient method to address this penalty problem and thereby obtain a desirable solution for the original binary orthogonal problems. Finally, the proposed method is applied to supervised and unsupervised hashing tasks and is compared with several popular methods on the MNIST and CIFAR-10 datasets. The numerical comparisons reveal that our algorithm is significantly superior to other solvers in terms of feasibility violation, and it is comparable or even superior to others in terms of evaluation metrics related to the Hamming distance.
2.Universal Approximation of Parametric Optimization via Neural Networks with Piecewise Linear Policy Approximation
Authors:Hyunglip Bae, Jang Ho Kim, Woo Chang Kim
Abstract: Parametric optimization solves a family of optimization problems as a function of parameters. It is a critical component in situations where optimal decision making is repeatedly performed for updated parameter values, but computation becomes challenging when complex problems need to be solved in real-time. Therefore, in this study, we present theoretical foundations for approximating the optimal policy of a parametric optimization problem through Neural Networks and derive conditions that allow the Universal Approximation Theorem to be applied to parametric optimization problems by explicitly constructing a piecewise linear policy approximation. This study fills the gap in formally analyzing the constructed piecewise linear approximation in terms of feasibility and optimality, and shows that Neural Networks (with ReLU activations) can be a valid approximator for this approximation in terms of generalization and approximation error. Furthermore, based on the theoretical results, we propose a strategy to improve the feasibility of the approximated solution and discuss training with suboptimal solutions.
3.The Unique Solvability Conditions for the Generalized Absolute Value Equations
Authors:Shubham Kumar, Deepmala
Abstract: This paper investigates the conditions that guarantee unique solvability and unsolvability for the generalized absolute value equations (GAVE) given by $Ax - B \vert x \vert = b$. Further, these conditions are also valid to determine the unique solution of the generalized absolute value matrix equations (GAVME) $AX - B \vert X \vert =F$. Finally, certain aspects related to the solvability and unsolvability of the absolute value equations (AVE) have been deliberated upon.
4.Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Authors:Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu
Abstract: The conjugate gradient method is a crucial first-order optimization method that generally converges faster than the steepest descent method, and its computational cost is much lower than that of second-order methods. However, while various types of conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, there has been little study of them in distributed scenarios. This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold. The optimization problem is distributed among a network of agents, where each agent is associated with a local function, and communication between agents occurs over an undirected connected graph. Since the Stiefel manifold is a non-convex set, a global function is represented as a finite sum of possibly non-convex (but smooth) local functions. The proposed method is free from expensive Riemannian geometric operations such as retractions, exponential maps, and vector transports, thereby reducing the computational complexity required by each agent. To the best of our knowledge, DRCGD is the first decentralized Riemannian conjugate gradient algorithm to achieve global convergence over the Stiefel manifold.
5.Restricted inverse optimal value problem on linear programming under weighted $l_1$ norm
Authors:Junhua Jia, Xiucui Guan, Xinqiang Qian, Panos M. Pardalos
Abstract: We study the restricted inverse optimal value problem on linear programming under weighted $l_1$ norm (RIOVLP$_1$). Given a linear programming problem $LP_c: \min \{cx|Ax=b,x\geq 0\}$ with a feasible solution $x^0$ and a value $K$, we aim to adjust the vector $c$ to $\bar{c}$ such that $x^0$ becomes an optimal solution of the problem LP$_{\bar c}$ whose objective value $\bar{c}x^0$ equals $K$. The objective is to minimize the distance $\|\bar c - c\|_1=\sum_{j=1}^nd_j|\bar c_j-c_j|$ under weighted $l_1$ norm. Firstly, we formulate the problem (RIOVLP$_1$) as a linear programming problem by duality theory. Secondly, we construct a sub-problem $(D^z)$, which has the same form as $LP_c$, of the dual (RIOVLP$_1$) problem corresponding to a given value $z$. Thirdly, when the coefficient matrix $A$ is unimodular, we design a binary search algorithm to calculate the critical value $z^*$ corresponding to an optimal solution of the problem (RIOVLP$_1$). Finally, we solve the (RIOV) problems on the Hitchcock and shortest path problems, respectively, in $O(T_{MCF}\log\max\{d_{max},x^0_{max},n\})$ time, where we solve a sub-problem $(D^z)$ by minimum cost flow in $T_{MCF}$ time in each iteration. The values $d_{max},x^0_{max}$ are the maximum values of $d$ and $x^0$, respectively.
6.Feedback rectifiable pairs and stabilization of switched linear systems
Authors:Maria C. Honecker, Hannes Gernandt, Kai Wulff, Carsten Trunk, Johann Reger
Abstract: We address the feedback design problem for switched linear systems. In particular we aim to design a switched state-feedback such that the resulting closed-loop switched system is in upper triangular form. To this effect we formulate and analyse the feedback rectification problem for pairs of matrices. We present necessary and sufficient conditions for the feedback rectifiability of pairs for two subsystems and give a constructive procedure to design stabilizing state-feedback for a class of switched systems. Several examples illustrate the characteristics of the problem considered and the application of the proposed constructive procedure.
7.A Homogenization Approach for Gradient-Dominated Stochastic Optimization
Authors:Jiyuan Tan, Chenyu Xue, Chuwen Zhang, Qi Deng, Dongdong Ge, Yinyu Ye
Abstract: The gradient dominance property is a condition weaker than strong convexity, yet it suffices to ensure global convergence of first-order methods even in non-convex optimization. This property finds application in various machine learning domains, including matrix decomposition, linear neural networks, and policy-based reinforcement learning (RL). In this paper, we study the stochastic homogeneous second-order descent method (SHSODM) for gradient-dominated optimization with $\alpha \in [1, 2]$ based on a recently proposed homogenization approach. Theoretically, we show that SHSODM achieves a sample complexity of $O(\epsilon^{-7/(2 \alpha) +1})$ for $\alpha \in [1, 3/2)$ and $\tilde{O}(\epsilon^{-2/\alpha})$ for $\alpha \in [3/2, 2]$. We further provide a SHSODM with a variance reduction technique enjoying an improved sample complexity of $O( \epsilon ^{-( 7-3\alpha ) /( 2\alpha )})$ for $\alpha \in [1,3/2)$. Our results match the state-of-the-art sample complexity bounds for stochastic gradient-dominated optimization without cubic regularization. Since the homogenization approach only relies on solving extremal eigenvector problems instead of Newton-type systems, our methods gain the advantage of cheaper iterations and robustness in ill-conditioned problems. Numerical experiments on several RL tasks demonstrate the efficiency of SHSODM compared to other off-the-shelf methods.
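For readers unfamiliar with the parameter $\alpha$, the gradient dominance property is commonly stated as below. This is the standard textbook form, given here as an assumption about the convention in use; the constant $C > 0$ and the notation $f^{*} = \inf_x f(x)$ are not taken from the abstract.

```latex
f(x) - f^{*} \;\le\; C \,\|\nabla f(x)\|^{\alpha}
\qquad \text{for all } x, \quad \alpha \in [1, 2].
```

Under this convention, the case $\alpha = 2$ corresponds to the Polyak-Łojasiewicz condition.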
1.Geometric characterizations for strong minima with applications to nuclear norm minimization problems
Authors:Jalal Fadili, Tran T. A. Nghia, Duy Nhat Phan
Abstract: In this paper, we introduce several geometric characterizations for strong minima of optimization problems. Applying these results to nuclear norm minimization problems allows us to obtain new necessary and sufficient quantitative conditions for this important property. Our characterizations for strong minima are weaker than the Restricted Injectivity and Nondegenerate Source Condition, which are usually used to identify solution uniqueness of nuclear norm minimization problems. Consequently, we obtain the minimum (tight) bound on the number of measurements for (strong) exact recovery of low-rank matrices.
1.Convex Optimization-Based Model Predictive Control for the Guidance of Active Debris Removal Transfers
Authors:Minduli Wijayatunga, Roberto Armellin, Harry Holt, Laura Pirovano, Claudio Bombardelli
Abstract: Active debris removal (ADR) missions have garnered significant interest as means of mitigating collision risks in space. This work proposes a convex optimization-based model predictive control (MPC) approach to provide guidance for such missions. While convex optimization can obtain optimal solutions in polynomial time, it relies on the successive convexification of nonconvex dynamics, leading to inaccuracies. Here, the need for successive convexification is eliminated by using near-linear Generalized Equinoctial Orbital Elements (GEqOE) and by updating the reference trajectory through a new split-Edelbaum approach. The solution accuracy is then measured relative to a high-fidelity dynamics model, showing that the MPC-convex method can generate accurate solutions without iterations.
2.Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm
Authors:Chengjing Wang, Peipei Tang, Wenling He, Meixia Lin
Abstract: Graphical models have exhibited their performance in numerous tasks ranging from biological analysis to recommender systems. However, graphical models with hub nodes are computationally difficult to fit, particularly when the dimension of the data is large. To efficiently estimate the hub graphical models, we introduce a two-phase algorithm. The proposed algorithm first generates a good initial point via a dual alternating direction method of multipliers (ADMM), and then warm starts a semismooth Newton (SSN) based augmented Lagrangian method (ALM) to compute a solution that is accurate enough for practical tasks. The sparsity structure of the generalized Jacobian ensures that the algorithm can obtain a good solution very efficiently. Comprehensive experiments on both synthetic data and real data show that it clearly outperforms the existing state-of-the-art algorithms. In particular, in some high-dimensional tasks, it can save more than 70% of the execution time while still achieving a high-quality estimation.
3.Stabilizability for nonautonomous linear parabolic equations with actuators as distributions
Authors:Karl Kunisch, Sérgio S. Rodrigues, Daniel Walter
Abstract: The stabilizability of a general class of abstract parabolic-like equations is investigated, with a finite number of actuators. This class includes the case of actuators given as delta distributions located at given points in the spatial domain of concrete parabolic equations. A stabilizing feedback control operator is constructed and given in explicit form. Then, an associated optimal control is considered and the corresponding Riccati feedback is investigated. Results of simulations are presented showing the stabilizing performance of both explicit and Riccati feedbacks.
4.Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models
Authors:Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi
Abstract: We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data-covariance. In particular, we demonstrate a deterministic equivalent of SGD in the form of a system of ordinary differential equations that describes a wide class of statistics, such as the risk and other measures of sub-optimality. This equivalence holds with overwhelming probability when the model parameter count grows proportionally to the number of data. This framework allows us to obtain learning rate thresholds for stability of SGD as well as convergence guarantees. In addition to the deterministic equivalent, we introduce an SDE with a simplified diffusion coefficient (homogenized SGD) which allows us to analyze the dynamics of general statistics of SGD iterates. Finally, we illustrate this theory on some standard examples and show numerical simulations which give an excellent match to the theory.
5.Progressively Strengthening and Tuning MIP Solvers for Reoptimization
Authors:Krunal Kishor Patel
Abstract: This paper explores reoptimization techniques for solving sequences of similar mixed integer programs (MIPs) more effectively. Traditionally, these MIPs are solved independently, without capitalizing on information from previously solved instances. Our approach focuses on primal bound improvements by reusing the solutions of the previously solved instances as well as dual bound improvements by reusing the branching history and automating parameter-tuning. We also describe ways to improve the solver performance by extending ideas from reliability branching to generate better pseudocosts. Our reoptimization approach, which we developed for the computational competition of the MIP 2023 workshop, earned us the first prize. In this paper, we thoroughly analyze the performance of each technique and their combined impact on the solver's performance. Finally, we present ways to extend our techniques in practice for further improvements.
6.Derivative-Free Global Minimization in One Dimension: Relaxation, Monte Carlo, and Sampling
Authors:Alexandra A. Gomes, Diogo A. Gomes
Abstract: We introduce a derivative-free global optimization algorithm that efficiently computes minima for various classes of one-dimensional functions, including non-convex and non-smooth functions. This algorithm numerically approximates the gradient flow of a relaxed functional, integrating strategies such as Monte Carlo methods, rejection sampling, and adaptive techniques. These strategies enhance performance in solving a diverse range of optimization problems while significantly reducing the number of required function evaluations compared to established methods. We present a proof of the convergence of the algorithm and illustrate its performance by comprehensive benchmarking. The proposed algorithm offers substantial potential for real-world models. It is particularly advantageous in situations requiring computationally intensive objective function evaluations.
7.A non-convex relaxed version of minimax theorems
Authors:M. I. A. Ghitri, A. Hantoute
Abstract: Given a subset $A\times B$ of a locally convex space $X\times Y$ (with $A$ compact) and a function $f:A\times B\rightarrow\overline{\mathbb{R}}$ such that $f(\cdot,y),$ $y\in B,$ are concave and upper semicontinuous, the minimax inequality $\max_{x\in A} \inf_{y\in B} f(x,y) \geq \inf_{y\in B} \sup_{x\in A_{0}} f(x,y)$ is shown to hold provided that $A_{0}$ is the set of $x\in A$ such that $f(x,\cdot)$ is proper, convex and lower semicontinuous. Moreover, if in addition $A\times B\subset f^{-1}(\mathbb{R})$, then we can take as $A_{0}$ the set of $x\in A$ such that $f(x,\cdot)$ is convex. The relation to Moreau's biconjugate representation theorem is discussed, and some applications to convex duality are provided. Key words. Minimax theorem, Moreau theorem, conjugate function, convex optimization.
8.A DPG method for linear quadratic optimal control problems
Authors:Thomas Führer, Francisco Fuica
Abstract: The DPG method with optimal test functions for solving linear quadratic optimal control problems with control constraints is studied. We prove existence of a unique optimal solution of the nonlinear discrete problem and characterize it through first order optimality conditions. Furthermore, we systematically develop a priori as well as a posteriori error estimates. Our proposed method can be applied to a wide range of constrained optimal control problems subject to, e.g., scalar second-order PDEs and the Stokes equations. Numerical experiments that illustrate our theoretical findings are presented.
9.Linear Parameter Varying Power Regulation of Variable Speed Pitch Manipulated Wind Turbine in the Full Load Regime
Authors:T. Shaqarin, Mahmoud M. S. Al-Suod
Abstract: In a wind energy conversion system (WECS), changing the pitch angle of the wind turbine blades is a typical practice to regulate the electrical power generation in the full-load regime. Due to the turbulent nature of the wind and the large variations of the mean wind speed during the day, the rotary elements of the WECS are subjected to significant mechanical stresses and fatigue, conceivably resulting in mechanical failures and higher maintenance costs. Consequently, it is imperative to design a control system capable of handling continuous wind changes. In this work, a Linear Parameter Varying (LPV) $H_\infty$ controller is used to cope with wind variations and turbulent winds with a turbulence intensity greater than 10%. The proposed controller is designed to regulate the rotational rotor speed and generator torque, thus regulating the output power via pitch angle manipulations. In addition, a PI-Fuzzy control system is designed to be compared with the proposed control system. The closed-loop simulations of both controllers established the robustness and stability of the suggested LPV controller under large wind velocity variations, with minute power fluctuations compared to the PI-Fuzzy controller. The results show that in the presence of turbulent wind speed variations, the proposed LPV controller achieves improved transient and steady-state performance along with reduced mechanical loads in the above-rated wind speed region.
1.Stochastic Controlled Averaging for Federated Learning with Communication Compression
Authors:Xinmeng Huang, Ping Li, Xiaoyun Li
Abstract: Communication compression, a technique aiming to reduce the information volume to be transmitted over the air, has gained great interest in Federated Learning (FL) for its potential to alleviate communication overhead. However, communication compression brings forth new challenges in FL due to the interplay of compression-incurred information distortion and inherent characteristics of FL such as partial participation and data heterogeneity. Despite the recent development, the performance of compressed FL approaches has not been fully exploited. The existing approaches either cannot accommodate arbitrary data heterogeneity or partial participation, or require stringent conditions on compression. In this paper, we revisit the seminal stochastic controlled averaging method by proposing an equivalent but more efficient/simplified formulation with halved uplink communication costs. Building upon this implementation, we propose two compressed FL algorithms, SCALLION and SCAFCOM, to support unbiased and biased compression, respectively. Both the proposed methods outperform the existing compressed FL methods in terms of communication and computation complexities. Moreover, SCALLION and SCAFCOM accommodate arbitrary data heterogeneity and do not make any additional assumptions on compression errors. Experiments show that SCALLION and SCAFCOM can match the performance of corresponding full-precision FL approaches with substantially reduced uplink communication, and outperform recent compressed FL methods under the same communication budget.
2.Learning to Pivot as a Smart Expert
Authors:Tianhao Liu, Shanwen Pu, Dongdong Ge, Yinyu Ye
Abstract: Linear programming has been practically solved mainly by simplex and interior point methods. Compared with the weakly polynomial complexity obtained by the interior point methods, the existence of strongly polynomial bounds for the length of the pivot path generated by the simplex methods remains a mystery. In this paper, we propose two novel pivot experts that leverage both global and local information of the linear programming instances for the primal simplex method and show their excellent performance numerically. The experts can be regarded as a benchmark to evaluate the performance of classical pivot rules, although they are hard to directly implement. To tackle this challenge, we employ a graph convolutional neural network model, trained via imitation learning, to mimic the behavior of the pivot expert. Our pivot rule, learned empirically, displays a significant advantage over conventional methods in various linear programming problems, as demonstrated through a series of rigorous experiments.
3.A Joint Electricity and Carbon Pricing Method
Authors:Yue Chen, Changhong Zhao
Abstract: The joint electricity and carbon pricing (JECP) problem is crucial for the low-carbon energy system transition. It is also challenging due to requirements such as providing incentives that can motivate market participants to follow the dispatch schedule and minimizing the impact on affected parties compared to when they were in the traditional electricity market. This letter proposes a novel JECP method based on partial carbon tax and primal-dual optimality conditions. Several nice properties of the proposed method are proven. Tests on different systems show its advantages over the two existing pricing methods.
4.Norm and time optimal control problems of stochastic heat equations
Authors:Yuanhang Liu, Donghui Yang, Jie Zhong
Abstract: This paper investigates the norm and time optimal control problems for stochastic heat equations. We begin by presenting a characterization of the norm optimal control, followed by a discussion of its properties. We then explore the equivalence between the norm optimal control and time optimal control, and subsequently establish the bang-bang property of the time optimal control. To the best of our knowledge, these problems are among the first to be discussed in the stochastic case.
5.SCQPTH: an efficient differentiable splitting method for convex quadratic programming
Authors:Andrew Butler
Abstract: We present SCQPTH: a differentiable first-order splitting method for convex quadratic programs. The SCQPTH framework is based on the alternating direction method of multipliers (ADMM) and the software implementation is motivated by the state-of-the-art solver OSQP: an operator splitting solver for convex quadratic programs (QPs). The SCQPTH software is made available as an open-source Python package and contains many similar features including efficient reuse of matrix factorizations, infeasibility detection, automatic scaling and parameter selection. The forward pass algorithm performs operator splitting in the dimension of the original problem space and is therefore suitable for large scale QPs with $100-1000$ decision variables and thousands of constraints. Backpropagation is performed by implicit differentiation of the ADMM fixed-point mapping. Experiments demonstrate that for large scale QPs, SCQPTH can provide a $1\times - 10\times$ improvement in computational efficiency in comparison to existing differentiable QP solvers.
6.Global solution and optimal control of an epidemic propagation with a heterogeneous diffusion
Authors:Pierluigi Colli, Gianni Gilardi, Gabriela Marinoschi
Abstract: In this paper, we explore the solvability and the optimal control problem for a compartmental model based on reaction-diffusion partial differential equations describing a transmissible disease. The nonlinear model takes into account the disease spreading due to the human social diffusion, under a dynamic heterogeneity in infection risk. The analysis of the resulting system provides the existence proof for a global solution and determines the conditions of optimality to reduce the concentration of the infected population in certain spatial areas.
7.A Framework for Data-Driven Explainability in Mathematical Optimization
Authors:Kevin-Martin Aigner, Marc Goerigk, Michael Hartisch, Frauke Liers, Arthur Miehlich
Abstract: Advancements in mathematical programming have made it possible to efficiently tackle large-scale real-world problems that were deemed intractable just a few decades ago. However, provably optimal solutions may not be accepted due to the perception of optimization software as a black box. Although well understood by scientists, this lacks easy accessibility for practitioners. Hence, we advocate for introducing the explainability of a solution as another evaluation criterion, next to its objective value, which enables us to find trade-off solutions between these two criteria. Explainability is attained by comparing against (not necessarily optimal) solutions that were implemented in similar situations in the past. Thus, solutions are preferred that exhibit similar features. Although we prove that already in simple cases the explainable model is NP-hard, we characterize relevant polynomially solvable cases such as the explainable shortest-path problem. Our numerical experiments on both artificial as well as real-world road networks show the resulting Pareto front. It turns out that the cost of enforcing explainability can be very small.
8.Digital twinning of cardiac electrophysiology models from the surface ECG: a geodesic backpropagation approach
Authors:Thomas Grandits, Jan Verhülsdonk, Gundolf Haase, Alexander Effland, Simone Pezzuto
Abstract: The eikonal equation has become an indispensable tool for modeling cardiac electrical activation accurately and efficiently. In principle, by matching clinically recorded and eikonal-based electrocardiograms (ECGs), it is possible to build patient-specific models of cardiac electrophysiology in a purely non-invasive manner. Nonetheless, the fitting procedure remains a challenging task. The present study introduces a novel method, Geodesic-BP, to solve the inverse eikonal problem. Geodesic-BP is well-suited for GPU-accelerated machine learning frameworks, allowing us to optimize the parameters of the eikonal equation to reproduce a given ECG. We show that Geodesic-BP can reconstruct a simulated cardiac activation with high accuracy in a synthetic test case, even in the presence of modeling inaccuracies. Furthermore, we apply our algorithm to a publicly available dataset of a rabbit model, with very positive results. Given the future shift towards personalized medicine, Geodesic-BP has the potential to help in future functionalizations of cardiac models meeting clinical time constraints while maintaining the physiological accuracy of state-of-the-art cardiac models.
9.Constrained Global Optimization by Smoothing
Authors:Vladimir Norkin, Alois Pichler, Anton Kozyriev
Abstract: This paper proposes a novel technique called "successive stochastic smoothing" that optimizes nonsmooth and discontinuous functions while considering various constraints. Our methodology enables local and global optimization, making it a powerful tool for many applications. First, a constrained problem is reduced to an unconstrained one by the exact nonsmooth penalty function method, which does not assume the existence of the objective function outside the feasible area and does not require the selection of the penalty coefficient. This reduction is exact in the case of minimization of a lower semicontinuous function under convex constraints. Then the resulting objective function is sequentially smoothed by the kernel method starting from relatively strong smoothing and with a gradually vanishing degree of smoothing. The finite difference stochastic gradient descent with trajectory averaging minimizes each smoothed function locally. Finite differences over stochastic directions sampled from the kernel estimate the stochastic gradients of the smoothed functions. We investigate the convergence rate of such stochastic finite-difference method on convex optimization problems. The "successive smoothing" algorithm uses the results of previous optimization runs to select the starting point for optimizing a consecutive, less smoothed function. Smoothing provides the "successive smoothing" method with some global properties. We illustrate the performance of the "successive stochastic smoothing" method on test-constrained optimization problems from the literature.
10.Differentiable Robust Model Predictive Control
Authors:Alex Oshin, Evangelos A. Theodorou
Abstract: Deterministic model predictive control (MPC), while powerful, is often insufficient for effectively controlling autonomous systems in the real-world. Factors such as environmental noise and model error can cause deviations from the expected nominal performance. Robust MPC algorithms aim to bridge this gap between deterministic and uncertain control. However, these methods are often excessively difficult to tune for robustness due to the nonlinear and non-intuitive effects that controller parameters have on performance. To address this challenge, a unifying perspective on differentiable optimization for control is presented, which enables derivation of a general, differentiable tube-based MPC algorithm. The proposed approach facilitates the automatic and real-time tuning of robust controllers in the presence of large uncertainties and disturbances.
11.Episodic Bayesian Optimal Control with Unknown Randomness Distributions
Authors:Alexander Shapiro, Enlu Zhou, Yifan Lin, Yuhao Wang
Abstract: Stochastic optimal control with unknown randomness distributions has been studied for a long time, encompassing robust control, distributionally robust control, and adaptive control. We propose a new episodic Bayesian approach that incorporates Bayesian learning with optimal control. In each episode, the approach learns the randomness distribution with a Bayesian posterior and subsequently solves the corresponding Bayesian average estimate of the true problem. The resulting policy is exercised during the episode, while additional data/observations of the randomness are collected to update the Bayesian posterior for the next episode. We show that the resulting episodic value functions and policies converge almost surely to their optimal counterparts of the true problem if the parametrized model of the randomness distribution is correctly specified. We further show that the asymptotic convergence rate of the episodic value functions is of the order $O(N^{-1/2})$. We develop an efficient computational method based on stochastic dual dynamic programming for a class of problems that have convex value functions. Our numerical results on a classical inventory control problem verify the theoretical convergence results and demonstrate the effectiveness of the proposed computational method.
12.Generalizing the Min-Max Regret Criterion using Ordered Weighted Averaging
Authors:Werner Baak, Marc Goerigk, Adam Kasperski, Paweł Zieliński
Abstract: In decision making under uncertainty, several criteria have been studied to aggregate the performance of a solution over multiple possible scenarios, including the ordered weighted averaging (OWA) criterion and min-max regret. This paper introduces a novel generalization of min-max regret, leveraging the modeling power of OWA to enable a more nuanced expression of preferences in handling regret values. This new OWA regret approach is studied both theoretically and numerically. We derive several properties, including polynomially solvable and hard cases, and introduce an approximation algorithm. Through computational experiments using artificial and real-world data, we demonstrate the advantages of our OWAR method over the conventional min-max regret approach, alongside the effectiveness of the proposed clustering heuristics.
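As a rough illustration of how an OWA criterion can be applied to regret values, the sketch below computes per-scenario regrets and aggregates them with ordered weights. The function names `owa` and `owa_regret` and the small cost table are hypothetical and are not taken from the paper. The sketch assumes the common convention that weights are applied to values sorted from largest to smallest, so the weight vector (1, 0, ..., 0) recovers classical min-max regret.

```python
def owa(values, weights):
    """Ordered weighted average: weights are applied to the values
    after sorting them from largest to smallest."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def owa_regret(costs, solution, weights):
    """OWA of the regrets of `solution`, where costs[s][k] is the cost of
    candidate solution s in scenario k (illustrative data layout)."""
    n_scenarios = len(weights)
    # Best achievable cost in each scenario over all candidate solutions.
    best = [min(row[k] for row in costs.values()) for k in range(n_scenarios)]
    regrets = [costs[solution][k] - best[k] for k in range(n_scenarios)]
    return owa(regrets, weights)


# Hypothetical cost table: three candidate solutions, three scenarios.
costs = {"a": [4, 9, 6], "b": [7, 5, 8], "c": [6, 7, 7]}

max_regret_weights = [1, 0, 0]          # classical min-max regret
uniform_weights = [1 / 3, 1 / 3, 1 / 3]  # average regret

best_minmax = min(costs, key=lambda s: owa_regret(costs, s, max_regret_weights))
best_average = min(costs, key=lambda s: owa_regret(costs, s, uniform_weights))
print(best_minmax, best_average)
```

On this toy table the two weight vectors select different solutions, which is the kind of preference shift that intermediate weight vectors allow one to tune.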
1.Q-Learning for Continuous State and Action MDPs under Average Cost Criteria
Authors:Ali Devran Kara, Serdar Yuksel
Abstract: For infinite-horizon average-cost criterion problems, we present several approximation and reinforcement learning results for Markov Decision Processes with standard Borel spaces. Toward this end, (i) we first provide a discretization based approximation method for fully observed Markov Decision Processes (MDPs) with continuous spaces under average cost criteria, and we provide error bounds for the approximations when the dynamics are only weakly continuous under certain ergodicity assumptions. In particular, we relax the total variation condition given in prior work to weak continuity as well as Wasserstein continuity conditions. (ii) We provide synchronous and asynchronous Q-learning algorithms for continuous spaces via quantization, and establish their convergence. (iii) We show that the convergence is to the optimal Q values of the finite approximate models constructed via quantization. Our Q-learning convergence results and their convergence to near optimality are new for continuous spaces, and the proof method is new even for finite spaces, to our knowledge.
2.Entropic Model Predictive Optimal Transport for Underactuated Linear Systems
Authors:Kaito Ito, Kenji Kashima
Abstract: This letter investigates dynamical optimal transport of underactuated linear systems over an infinite time horizon. In our previous work, we proposed to integrate model predictive control and the celebrated Sinkhorn algorithm to perform efficient dynamical transport of agents. However, the proposed method requires the invertibility of input matrices, which severely limits its applicability. To resolve this issue, we extend the method to (possibly underactuated) controllable linear systems. In addition, we ensure the convergence properties of the method for general controllable linear systems. The effectiveness of the proposed method is demonstrated by a numerical example.
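For readers unfamiliar with the Sinkhorn algorithm mentioned above, the following is a minimal sketch of its basic static form: alternating row and column scalings of a Gibbs kernel for entropy-regularized optimal transport between two discrete distributions. It does not reproduce the letter's model predictive scheme or its dynamics. The function name `sinkhorn`, the point locations, the squared-distance cost, and the regularization value `eps` are all illustrative assumptions.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, iters=500):
    """Entropic optimal transport between histograms a and b.

    Returns a coupling P with P[i][j] = u[i] * K[i][j] * v[j], where
    K = exp(-cost / eps) and u, v are found by alternating scalings.
    """
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Scale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical example: three source points, two target points on a line.
xs = [0.0, 1.0, 2.0]
ys = [0.5, 1.5]
a = [0.2, 0.5, 0.3]
b = [0.6, 0.4]
cost = [[(x - y) ** 2 for y in ys] for x in xs]

P = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(len(a))) for j in range(len(b))]
print(row_sums, col_sums)
```

After enough iterations the row and column sums of `P` agree with `a` and `b` up to numerical tolerance. Smaller values of `eps` give couplings closer to unregularized transport but slow convergence and can underflow in this naive form.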
3.Quantile Optimization via Multiple Timescale Local Search for Black-box Functions
Authors:Jiaqiao Hu, Meichen Song, Michael C. Fu
Abstract: We consider quantile optimization of black-box functions that are estimated with noise. We propose two new iterative three-timescale local search algorithms. The first algorithm uses an appropriately modified finite-difference-based gradient estimator that requires $2d + 1$ samples of the black-box function per iteration of the algorithm, where $d$ is the number of decision variables (dimension of the input vector). For higher-dimensional problems, this algorithm may not be practical if the black-box function estimates are expensive. The second algorithm employs a simultaneous-perturbation-based gradient estimator that uses only three samples for each iteration regardless of problem dimension. Under appropriate conditions, we show the almost sure convergence of both algorithms. In addition, for the class of strongly convex functions, we further establish their (finite-time) convergence rate through a novel fixed-point argument. Simulation experiments indicate that the algorithms work well on a variety of test problems and compare well with recently proposed alternative methods.
4.60 years of cyclic monotonicity: a survey
Authors:A. Kausamo, L. De Pascale, K. Wyczesany
Abstract: The primary purpose of this note is to provide an instructional summary of the state of the art regarding cyclic monotonicity and related notions. We will also present how these notions are tied to optimality in the optimal transport (or Monge-Kantorovich) problem.
5.High-order proximal point algorithm for the monotone variational inequality problem and its application
Authors:Jingyu Gao, Xiurui Geng
Abstract: The proximal point algorithm (PPA) has been developed to solve the monotone variational inequality problem. It provides a theoretical foundation for some methods, such as the augmented Lagrangian method (ALM) and the alternating direction method of multipliers (ADMM). This paper generalizes the PPA to the $p$th-order ($p\geq 1$) and proves its convergence rate $O \left(1/k^{p/2}\right)$ . Additionally, the $p$th-order ALM is proposed based on the $p$th-order PPA. Some numerical experiments are presented to demonstrate the performance of the $p$th-order ALM.
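To make the classical method concrete, here is a minimal sketch of the standard first-order PPA (not the paper's $p$th-order generalization) on a toy unconstrained monotone problem: find $x$ with $F(x) = Ax + b = 0$, where $A$ is a skew-symmetric $2\times 2$ matrix. Each PPA step solves $x_{k+1} + cF(x_{k+1}) = x_k$, which here is a linear system with a closed-form solution. The operator, the step size `c`, and the function names are illustrative assumptions.

```python
def resolvent_step(x, c, b):
    """One PPA step for F(x) = A x + b with A = [[0, 1], [-1, 0]].

    Solves (I + c*A) y = x - c*b, using the explicit inverse
    (I + c*A)^{-1} = [[1, -c], [c, 1]] / (1 + c^2).
    """
    r0 = x[0] - c * b[0]
    r1 = x[1] - c * b[1]
    det = 1.0 + c * c
    return ((r0 - c * r1) / det, (c * r0 + r1) / det)


def ppa(x0, c, b, iters=100):
    """Run the proximal point iteration from x0 with a fixed parameter c."""
    x = x0
    for _ in range(iters):
        x = resolvent_step(x, c, b)
    return x


# A x = (x2, -x1), so F(x) = 0 with b = (-1, 2) has the solution x = (2, 1).
b = (-1.0, 2.0)
x_star = ppa((0.0, 0.0), c=1.0, b=b)
print(x_star)
```

For this skew-symmetric operator the distance to the solution shrinks by the factor $1/\sqrt{1+c^2}$ per step, whereas the explicit update $x - cF(x)$ would grow that distance by $\sqrt{1+c^2}$. This is one reason implicit proximal steps are used for merely monotone problems.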
6.A Fast Smoothing Newton Method for Bilevel Hyperparameter Optimization for SVC with Logistic Loss
Authors:Yixin Wang, Qingna Li
Abstract: Support Vector Classification with logistic loss has excellent theoretical properties in classification problems where the label values are not continuous. In this paper, we reformulate the hyperparameter selection for SVC with logistic loss as a bilevel optimization problem in which the upper-level problem and the lower-level problem are both based on logistic loss. The resulting bilevel optimization model is converted to a single-level nonlinear programming (NLP) problem based on the KKT conditions of the lower-level problem. Such NLP contains a set of nonlinear equality constraints and a simple lower bound constraint. The second-order sufficient condition is characterized, which guarantees that the strict local optimizers are obtained. To solve such NLP, we apply the smoothing Newton method proposed in \cite{Liang} to solve the KKT conditions, which contain one pair of complementarity constraints. We show that the smoothing Newton method has a superlinear convergence rate. Extensive numerical results verify the efficiency of the proposed approach and strict local minimizers can be achieved both numerically and theoretically. In particular, our algorithm achieves competitive results while consuming less time than other methods.
7.Optimization of piecewise smooth shapes under uncertainty using the example of Navier-Stokes flow
Authors:Caroline Geiersbach, Tim Suchan, Kathrin Welker
Abstract: We investigate a complex system involving multiple shapes to be optimized in a domain, taking into account geometric constraints on the shapes and uncertainty appearing in the physics. We connect the differential geometry of product shape manifolds with multi-shape calculus, which provides a novel framework for the handling of piecewise smooth shapes. This multi-shape calculus is applied to a shape optimization problem where shapes serve as obstacles in a system governed by steady state incompressible Navier-Stokes flow. Numerical experiments use our recently developed stochastic augmented Lagrangian method and we investigate the choice of algorithmic parameters using the example of this application.
8.An efficient sieving based secant method for sparse optimization problems with least-squares constraints
Authors:Qian Li, Defeng Sun, Yancheng Yuan
Abstract: In this paper, we propose an efficient sieving based secant method to address the computational challenges of solving sparse optimization problems with least-squares constraints. A level-set method has been introduced in [X. Li, D.F. Sun, and K.-C. Toh, SIAM J. Optim., 28 (2018), pp. 1842--1866] that solves these problems by using the bisection method to find a root of a univariate nonsmooth equation $\varphi(\lambda) = \varrho$ for some $\varrho > 0$, where $\varphi(\cdot)$ is the value function computed by a solution of the corresponding regularized least-squares optimization problem. When the objective function in the constrained problem is a polyhedral gauge function, we prove that (i) for any positive integer $k$, $\varphi(\cdot)$ is piecewise $C^k$ in an open interval containing the solution $\lambda^*$ to the equation $\varphi(\lambda) = \varrho$; (ii) the Clarke Jacobian of $\varphi(\cdot)$ is always positive. These results allow us to establish the essential ingredients of the fast convergence rates of the secant method. Moreover, an adaptive sieving technique is incorporated into the secant method to effectively reduce the dimension of the level-set subproblems for computing the value of $\varphi(\cdot)$. The high efficiency of the proposed algorithm is demonstrated by extensive numerical results.
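The root-finding step described above can be sketched with a plain secant iteration on a univariate equation $\varphi(\lambda) = \varrho$. The function `phi` below is a hypothetical piecewise smooth, strictly increasing stand-in, not the value function of a regularized least-squares problem. The sketch omits the adaptive sieving technique and any safeguards the paper may use.

```python
def secant_root(phi, target, lam0, lam1, tol=1e-12, max_iter=100):
    """Secant iteration for phi(lam) = target from two initial guesses."""
    g0 = phi(lam0) - target
    g1 = phi(lam1) - target
    for _ in range(max_iter):
        # Stop on convergence, or if the secant slope is undefined.
        if abs(g1) <= tol or g1 == g0:
            break
        # Each update needs only one new evaluation of phi.
        lam0, lam1 = lam1, lam1 - g1 * (lam1 - lam0) / (g1 - g0)
        g0, g1 = g1, phi(lam1) - target
    return lam1


def phi(lam):
    """Hypothetical piecewise smooth increasing function with a kink at 1."""
    return lam if lam < 1.0 else lam * lam


# Solve phi(lam) = 4, whose solution is lam = 2.
root = secant_root(phi, target=4.0, lam0=0.5, lam1=3.0)
print(root)
```

Compared with bisection, each secant step still costs a single evaluation of `phi`, but near a root where the function is smooth with nonzero slope the error decreases superlinearly rather than halving.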
1.Non-Myopic Sensor Control for Target Search and Track Using a Sample-Based GOSPA Implementation
Authors:Marcel Hernandez, Angel Garcia-Fernandez, Simon Maskell
Abstract: This paper is concerned with sensor management for target search and track using the generalised optimal subpattern assignment (GOSPA) metric. Utilising the GOSPA metric to predict future system performance is computationally challenging, because of the need to account for uncertainties within the scenario, notably the number of targets, the locations of targets, and the measurements generated by the targets subsequent to performing sensing actions. In this paper, efficient sample-based techniques are developed to calculate the predicted mean square GOSPA metric. These techniques allow for missed detections and false alarms, and thereby enable the metric to be exploited in scenarios more complex than those previously considered. Furthermore, the GOSPA methodology is extended to perform non-myopic (i.e. multi-step) sensor management via the development of a Bellman-type recursion that optimises a conditional GOSPA-based metric. Simulations for scenarios with missed detections, false alarms, and planning horizons of up to three time steps demonstrate the approach, in particular showing that optimal plans align with an intuitive understanding of how taking into account the opportunity to make future observations should influence the current action. It is concluded that the GOSPA-based, non-myopic search and track algorithm offers a powerful mechanism for sensor management.
2.Existence of Markov equilibrium control in discrete time
Authors:Erhan Bayraktar, Bingyan Han
Abstract: For time-inconsistent stochastic controls in discrete time and finite horizon, an open problem in Björk and Murgoci (Finance Stoch, 2014) is the existence of an equilibrium control. A nonrandomized Borel measurable Markov equilibrium policy exists if the objective is inf-compact in every time step. We provide a sufficient condition for the inf-compactness and thus existence, with costs that are lower semicontinuous (l.s.c.) and bounded from below and transition kernels that are continuous in controls under given states. The control spaces need not be compact.
3.Self-Healing First-Order Distributed Optimization with Packet Loss
Authors:Israel L. Donato Ridgley, Randy A. Freeman, Kevin M. Lynch
Abstract: We describe SH-SVL, a parameterized family of first-order distributed optimization algorithms that enable a network of agents to collaboratively calculate a decision variable that minimizes the sum of cost functions at each agent. These algorithms are self-healing in that their convergence to the correct optimizer can be guaranteed even if they are initialized randomly, agents join or leave the network, or local cost functions change. We also present simulation evidence that our algorithms are self-healing in the case of dropped communication packets. Our algorithms are the first single-Laplacian methods for distributed convex optimization to exhibit all of these characteristics. We achieve self-healing by sacrificing internal stability, a fundamental trade-off for single-Laplacian methods.
4.Vibrational Stabilization of Cluster Synchronization in Oscillator Networks
Authors:Yuzhen Qin, Alberto Maria Nobili, Danielle S. Bassett, Fabio Pasqualetti
Abstract: Cluster synchronization is of paramount importance for the normal functioning of numerous technological and natural systems. Deviations from normal cluster synchronization patterns are closely associated with various malfunctions, such as neurological disorders in the brain. Therefore, it is crucial to restore normal system functions by stabilizing the appropriate cluster synchronization patterns. Most existing studies focus on designing controllers based on state measurements to achieve system stabilization. However, in many real-world scenarios, measuring system states, such as neuronal activity in the brain, poses significant challenges, rendering the stabilization of such systems difficult. To overcome this challenge, in this paper, we employ an open-loop control strategy, vibrational control, which does not require any state measurements. We establish some sufficient conditions under which vibrational inputs stabilize cluster synchronization. Further, we provide a tractable approach to design vibrational control. Finally, numerical experiments are conducted to demonstrate our theoretical findings.
1.Comparison of Dynamic Tomato Growth Models for Optimal Control in Greenhouses
Authors:Michael Fink, Annalena Daniels, Cheng Qian, Víctor Martínez Velásquez, Sahil Salotra, Dirk Wollherr
Abstract: As global demand for efficiency in agriculture rises, there is a growing interest in high-precision farming practices. Greenhouses in particular play a critical role in ensuring a year-round supply of fresh produce. In order to maximize efficiency and productivity while minimizing resource use, mathematical techniques such as optimal control have been employed. However, selecting appropriate models for optimal control requires domain expertise. This study aims to compare three established tomato models for their suitability in an optimal control framework. Results show that all three models have similar yield predictions and accuracy, but only two models are currently applicable for optimal control due to implementation limitations. The two remaining models each have advantages in terms of economic yield and computation times, but the differences in optimal control strategies suggest that they require more accurate parameter identification and calibration tailored to greenhouses.
1.Existence theorems for optimal solutions in semi-algebraic optimization
Authors:Jae Hyoung Lee, Gue Myung Lee, Tien Son Pham
Abstract: Consider the problem of minimizing a lower semi-continuous semi-algebraic function $f \colon \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ on an unbounded closed semi-algebraic set $S \subset \mathbb{R}^n.$ Employing adequate tools of semi-algebraic geometry, we first establish some properties of the tangency variety of the restriction of $f$ on $S.$ Then we derive verifiable necessary and sufficient conditions for the existence of optimal solutions of the problem as well as the boundedness from below and coercivity of the restriction of $f$ on $S.$ We also present a computable formula for the optimal value of the problem.
2.Optimal Control of Dynamic District Heating Networks
Authors:Christian Jäkle, Lena Reichle, Stefan Volkwein
Abstract: In the present paper an optimal control problem for a system of differential-algebraic equations (DAEs) is considered. This problem arises in the dynamic optimization of unsteady district heating networks. Based on the Carathéodory theory, existence of a unique solution to the DAE system is proved using specific properties of the district heating network model. Moreover, it is shown that the optimal control problem possesses optimal solutions. For the numerical experiments different networks are considered including also data from a real district heating network.
3.A Generalized Primal-Dual Correction Method for Saddle-Point Problems with the Nonlinear Coupling Operator
Authors:Sai Wang, Yi Gong
Abstract: Recently, the generalized primal-dual (GPD) method was developed for saddle-point problems (SPPs) with a linear coupling operator. However, the coupling operator in many engineering applications is nonlinear. In this letter, we propose a generalized primal-dual correction method (GPD-CM) to handle SPPs with a nonlinear coupling operator. To achieve this, we customize the proximal matrix and corrective matrix by adjusting the values of regularization factors. By the unified framework, the convergence of GPD-CM is directly obtained. Numerical results on an SPP with an exponential coupling operator support the theoretical analysis.
4.Communication-efficient distributed optimization with adaptability to system heterogeneity
Authors:Ziyi Yu, Nikolaos M. Freris
Abstract: We consider the setting of agents cooperatively minimizing the sum of local objectives plus a regularizer on a graph. This paper proposes a primal-dual method in consideration of three distinctive attributes of real-life multi-agent systems, namely: (i) expensive communication, (ii) lack of synchronization, and (iii) system heterogeneity. Specifically, we propose a distributed asynchronous algorithm with minimal communication cost, in which users commit variable amounts of local work on their respective sub-problems. We illustrate this both theoretically and experimentally in the machine learning setting, where the agents hold private data and use a stochastic Newton method as the local solver. Under standard assumptions on Lipschitz continuous gradients and strong convexity, our analysis establishes linear convergence in expectation and characterizes the dependency of the rate on the number of local iterations. We proceed a step further to propose a simple means for tuning agents' hyperparameters locally, so as to adjust to heterogeneity and accelerate the overall convergence. Last, we validate our proposed method on a benchmark machine learning dataset to illustrate the merits in terms of computation, communication, and run-time saving as well as adaptability to heterogeneity.
5.Unifying Distributionally Robust Optimization via Optimal Transport Theory
Authors:Jose Blanchet, Daniel Kuhn, Jiajin Li, Bahar Taskesen
Abstract: In the past few years, there has been considerable interest in two prominent approaches for Distributionally Robust Optimization (DRO): Divergence-based and Wasserstein-based methods. The divergence approach models misspecification in terms of likelihood ratios, while the latter models it through a measure of distance or cost in actual outcomes. Building upon these advances, this paper introduces a novel approach that unifies these methods into a single framework based on optimal transport (OT) with conditional moment constraints. Our proposed approach, for example, makes it possible for optimal adversarial distributions to simultaneously perturb likelihood and outcomes, while producing an optimal (in an optimal transport sense) coupling between the baseline model and the adversarial model. Additionally, the paper investigates several duality results and presents tractable reformulations that enhance the practical applicability of this unified framework.
6.Bounding the Difference between the Values of Robust and Non-Robust Markov Decision Problems
Authors:Ariel Neufeld, Julian Sester
Abstract: In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem, where the ambiguity set of probability kernels of the distributionally robust Markov decision process is described by a Wasserstein-ball around some reference kernel whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. Our derived upper bound for the difference between the value functions is dimension-free and depends linearly on the radius of the Wasserstein-ball.
7.Learning (With) Distributed Optimization
Authors:Aadharsh Aadhithya A, Abinesh S, Akshaya J, Jayanth M, Vishnu Radhakrishnan, Sowmya V, Soman K. P
Abstract: This paper provides an overview of the historical progression of distributed optimization techniques, tracing their development from early duality-based methods pioneered by Dantzig, Wolfe, and Benders in the 1960s to the emergence of the Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) algorithm. The initial focus on Lagrangian relaxation for convex problems and decomposition strategies led to the refinement of methods like the Alternating Direction Method of Multipliers (ADMM). The resurgence of interest in distributed optimization in the late 2000s, particularly in machine learning and imaging, demonstrated ADMM's practical efficacy and its unifying potential. This overview also highlights the emergence of the proximal center method and its applications in diverse domains. Furthermore, the paper underscores the distinctive features of ALADIN, which offers convergence guarantees for non-convex scenarios without introducing auxiliary variables, differentiating it from traditional augmentation techniques. In essence, this work encapsulates the historical trajectory of distributed optimization and underscores the promising prospects of ALADIN in addressing non-convex optimization challenges.
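To make the ADMM idea mentioned above concrete, here is a minimal sketch of scaled-form consensus ADMM on a toy problem. The problem (each agent i holds a private number a_i and minimizes 0.5*(x_i - a_i)^2 subject to all x_i agreeing with a shared variable z), the function name `consensus_admm`, and the penalty `rho` are illustrative assumptions, not taken from the paper. For this toy problem the consensus value is the average of the a_i.

```python
def consensus_admm(a, rho=1.0, iters=100):
    """Scaled-form consensus ADMM for: minimize sum_i 0.5*(x_i - a_i)^2  s.t.  x_i = z.

    Toy example (assumption): every local objective is a scalar quadratic,
    so each local update has a closed form.
    """
    n = len(a)
    x = [0.0] * n   # local copies held by the agents
    u = [0.0] * n   # scaled dual variables
    z = 0.0         # shared consensus variable
    for _ in range(iters):
        # Local step: argmin_x 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2
        x = [(ai + rho * (z - ui)) / (1.0 + rho) for ai, ui in zip(a, u)]
        # Consensus step: average of x_i + u_i
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate the consensus violation
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z, x


z_star, x_star = consensus_admm([1.0, 2.0, 6.0])
```

In this sketch only the consensus step needs information from all agents; the local and dual steps use each agent's own data, which is the property that makes ADMM attractive in distributed settings.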
8.Disturbance attenuation in the Euler-Bernoulli beam using piezoelectric actuators
Authors:Anton Selivanov, Emilia Fridman
Abstract: We consider a simply-supported Euler-Bernoulli beam with viscous and Kelvin-Voigt damping. Our objective is to attenuate the effect of an unknown distributed disturbance using one piezoelectric actuator. We show how to design a suitable $H_\infty$ state-feedback controller based on a finite number of dominating modes. If the remaining (infinitely many) modes are ignored, the calculated $L^2$ gain is wrong. This happens because of the spillover phenomenon that occurs when the effect of the control on truncated modes is not accounted for in the feedback design. We propose a simple modification of the $H_\infty$ cost that prevents spillover. The key idea is to treat the control as a disturbance in the truncated modes and find the corresponding $L^2$ gains using the bounded real lemma. These $L^2$ gains are added to the control weight in the $H_\infty$ cost for the dominating modes, which prevents spillover. A numerical simulation of an aluminum beam with realistic parameters demonstrates the effectiveness of the proposed method.
9.Intercept Function and Quantity Bidding in Two-stage Electricity Market with Market Power Mitigation
Authors:Rajni Kant Bansal, Yue Chen, Pengcheng You, Enrique Mallada
Abstract: Electricity markets typically operate in two stages, day-ahead and real-time. Despite best efforts striving for efficiency, evidence of price manipulation has called for system-level market power mitigation (MPM) initiatives that substitute noncompetitive bids with default bids. Implementing these policies with a limited understanding of participant behavior may lead to unintended economic losses. In this paper, we model the competition between generators and inelastic loads in a two-stage market with stage-wise MPM policies. The loss of Nash equilibrium and lack of guarantee of stable market outcome in the case of conventional supply function bidding motivates the use of an alternative market mechanism where generators bid an intercept function. A Nash equilibrium analysis for a day-ahead MPM policy leads to a Stackelberg-Nash game with loads exercising market power at the expense of generators. A comparison of the resulting equilibrium with the standard market (not implementing any MPM policy) shows that a day-ahead policy completely mitigates the market power of generators. On the other hand, the real-time MPM policy increases demand allocation to real-time, contrary to current market practice with most electricity trades in the day-ahead market. Numerical studies illustrate the impact of the slope of the intercept function on the standard market.
1.Expected decrease for derivative-free algorithms using random subspaces
Authors:Warren Hare, Lindon Roberts, Clément W. Royer
Abstract: Derivative-free algorithms seek the minimum of a given function based only on function values queried at appropriate points. Although these methods are widely used in practice, their performance is known to worsen as the problem dimension increases. Recent advances in developing randomized derivative-free techniques have tackled this issue by working in low-dimensional subspaces that are drawn at random in an iterative fashion. The connection between the dimension of these random subspaces and the algorithmic guarantees has yet to be fully understood. In this paper, we develop an analysis for derivative-free algorithms (both direct-search and model-based approaches) employing random subspaces. Our results leverage linear local approximations of smooth functions to obtain understanding of the expected decrease achieved per function evaluation. Although the quantities of interest involve multidimensional integrals with no closed-form expression, a relative comparison for different subspace dimensions suggests that low dimension is preferable. Numerical computation of the quantities of interest confirms the benefit of operating in low-dimensional subspaces.
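As a rough illustration of the kind of method the abstract analyzes, the following is a minimal sketch of a direct-search loop that polls only along a few random Gaussian directions per iteration. It is a generic textbook-style scheme written under assumptions of our own (the name `random_subspace_direct_search`, the sufficient-decrease constant `1e-4`, and the doubling/halving step rule are all illustrative) and is not the algorithm or the analysis of the paper.

```python
import random


def random_subspace_direct_search(f, x0, subspace_dim=2, alpha=1.0, iters=200, seed=0):
    """Derivative-free direct search polling along random low-dimensional directions.

    Illustrative sketch: parameter names and constants are assumptions.
    """
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    n = len(x)
    for _ in range(iters):
        # Draw a few Gaussian directions; they span a random subspace of R^n.
        dirs = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(subspace_dim)]
        improved = False
        for d in dirs:
            for sign in (1.0, -1.0):
                trial = [xi + sign * alpha * di for xi, di in zip(x, d)]
                f_trial = f(trial)
                # Accept only if the decrease beats a forcing term of order alpha^2.
                if f_trial < fx - 1e-4 * alpha ** 2:
                    x, fx, improved = trial, f_trial, True
                    break
            if improved:
                break
        # Expand the step after a success, shrink it after a failed poll.
        alpha = alpha * 2.0 if improved else alpha * 0.5
    return x, fx


def sphere(v):
    return sum(vi * vi for vi in v)


start = [1.0] * 10
x_best, f_best = random_subspace_direct_search(sphere, start)
```

Each iteration costs at most `2 * subspace_dim` function evaluations regardless of the ambient dimension, which is the trade-off between subspace dimension and decrease per evaluation that the abstract refers to.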
2.Modelling and Simulation of District Heating Networks
Authors:Christian Jäkle, Lena Reichle, Stefan Volkwein
Abstract: In the present paper a detailed mathematical model is derived for district heating networks. After semidiscretization of the convective heat equation and introducing coupling conditions at the nodes of the network one gets a high-dimensional system of differential-algebraic equations (DAEs). Neglecting temporal changes of the water velocity in the pipes, the numerical solutions do not change significantly and the DAEs have index one. Numerical experiments illustrate that the model describes the real situation very well.
3.Periodic optimal control of a plug flow reactor model with an isoperimetric constraint
Authors:Yevgeniia Yevgenieva, Alexander Zuyev, Peter Benner, Andreas Seidel-Morgenstern
Abstract: We study a class of nonlinear hyperbolic partial differential equations with boundary control. This class describes chemical reactions of the type "$A \to$ product" carried out in a plug flow reactor (PFR) in the presence of an inert component. An isoperimetric optimal control problem with periodic boundary conditions and input constraints is formulated for the considered mathematical model in order to maximize the mean amount of product over the period. For the single-input system, the optimality of a bang-bang control strategy is proved in the class of bounded measurable inputs. The case of controlled flow rate input is also analyzed by exploiting the method of characteristics. A case study is performed to illustrate the performance of the reaction model under different control strategies.
4.Fourier series and sidewise profile control of 1-d waves
Authors:E. Zuazua
Abstract: We discuss the sidewise control properties of 1-d waves. In analogy with classical control and inverse problems for wave propagation, the problem consists in controlling the behaviour of waves on part of the boundary of the domain where they propagate, by means of control actions localised on a different subset of the boundary. In contrast with classical problems, the goal is not to control the dynamics of the waves on the interior of the domain, but rather their boundary traces. It is therefore a goal oriented controllability problem. We propose a duality method that reduces the problem to suitable new observability inequalities, which consist of estimating the boundary traces of waves on part of the boundary from boundary measurements done on another subset of the boundary. These inequalities lead to novel questions that do not seem to be treatable by the classical techniques employed in the field, such as Carleman inequalities, non-harmonic Fourier series, microlocal analysis and multipliers. We propose a genuinely 1-d solution method, based on sidewise energy propagation estimates yielding a complete sharp solution. The obtained observability results can be reinterpreted in terms of Fourier series. This leads to new non-standard questions in the context of non-harmonic Fourier series.
5.How to induce regularization in generalized linear models: A guide to reparametrizing gradient flow
Authors:Hung-Hsu Chou, Johannes Maly, Dominik Stöger
Abstract: In this work, we analyze the relation between reparametrizations of gradient flow and the induced implicit bias on general linear models, which encompass various basic classification and regression tasks. In particular, we aim at understanding the influence of the model parameters - reparametrization, loss, and link function - on the convergence behavior of gradient flow. Our results provide user-friendly conditions under which the implicit bias can be well-described and convergence of the flow is guaranteed. We furthermore show how to use these insights for designing reparametrization functions that lead to specific implicit biases like $\ell_p$- or trigonometric regularizers.
6.Comparative analysis of mathematical formulations for the two-dimensional guillotine cutting problem
Authors:Henrique Becker, Mateus Martin, Olinto Araujo, Luciana S. Buriol, Reinaldo Morabito
Abstract: About ten years ago, a paper proposed the first integer linear programming formulation for the constrained two-dimensional guillotine cutting problem (with unlimited cutting stages). Since then, six other formulations have followed, five of them in the last two years. This spike of interest gave no opportunity for a comprehensive comparison between the formulations. We review each formulation and compare their empirical results over instance datasets of the literature. We adapt most formulations to allow for piece rotation. The possibility of adaptation was already predicted but not realized by the prior work. The results show the dominance of pseudo-polynomial formulations until the point instances become intractable by them, while more compact formulations keep achieving good primal solutions. Our study also reveals a small but consistent advantage of the Gurobi solver over the CPLEX solver in our context; that the choice of solver hardly benefits one formulation over another; and a mistake in the generation of the T instances, which should have the same optima with or without guillotine cuts. Our study also proposes hybridising the most recent formulation with a prior formulation for a restricted version of the problem. The hybridisations show a reduction of about 20% of the branch-and-bound time thanks to the symmetries broken by the hybridisation.
7.Boosting Data-Driven Mirror Descent with Randomization, Equivariance, and Acceleration
Authors:Hong Ye Tan, Subhadip Mukherjee, Junqi Tang, Carola-Bibiane Schönlieb
Abstract: Learning-to-optimize (L2O) is an emerging research area in large-scale optimization for data science applications. Very recently, researchers have proposed a novel L2O framework called learned mirror descent (LMD), based on the classical mirror descent (MD) algorithm, with learnable mirror maps parameterized by input-convex neural networks. The LMD approach has been shown to significantly accelerate convex solvers while inheriting the convergence properties of the classical MD algorithm. Despite the initial successes in small-/mid-scale optimization problems demonstrating the potential of this framework, there is still a long way to go to make this scheme scalable and practical for high-dimensional problems. In this work, we provide several practical extensions of the LMD algorithm. We first propose accelerated and stochastic variants of LMD, leveraging classical momentum-based acceleration and stochastic optimization techniques for improving the convergence rate and per-iteration complexity. Moreover, for the particular application of training neural networks, we derive and propose a novel and efficient parameterization for the mirror potential, exploiting the equivariant structure of the training problems to significantly reduce the dimensionality of the underlying problem. We provide theoretical convergence guarantees for our schemes under standard assumptions, and demonstrate their effectiveness in various computational imaging and machine learning applications such as image inpainting and the training of SVMs.
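For readers unfamiliar with the classical mirror descent (MD) algorithm that LMD builds on, here is a minimal sketch with a fixed negative-entropy mirror map on the probability simplex, where the update reduces to a multiplicative-weights step. This is the classical hand-crafted scheme, not the learned mirror map of the paper; the function name, the linear test objective, and the step size are assumptions made for illustration.

```python
import math


def entropic_mirror_descent(grad, x0, step, iters):
    """Classical mirror descent on the simplex with the negative-entropy mirror map.

    The update x_i <- x_i * exp(-step * g_i), followed by normalization,
    is the closed form of the mirror step for this particular mirror map.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        weights = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(weights)
        x = [w / total for w in weights]
    return x


# Hypothetical test problem: minimize <costs, x> over the probability simplex.
costs = [0.9, 0.2, 0.5]
x_final = entropic_mirror_descent(lambda x: costs, [1.0 / 3.0] * 3, step=0.5, iters=200)
```

The iterates stay on the simplex without any explicit projection, which is the usual motivation for choosing a mirror map adapted to the constraint set.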
8.A Nesterov type algorithm with double Tikhonov regularization: fast convergence of the function values and strong convergence to the minimal norm solution
Authors:Mikhail Karapetyants, Szilárd Csaba László
Abstract: We investigate the strong convergence properties of a Nesterov type algorithm with two Tikhonov regularization terms in connection to the minimization problem of a smooth convex function $f.$ We show that the generated sequences converge strongly to the minimal norm element from $\operatorname{argmin} f$. We also show that from a practical point of view the Tikhonov regularization does not affect Nesterov's optimal convergence rate of order $\mathcal{O}(n^{-2})$ for the potential energies $f(x_n)-\min f$ and $f(y_n)-\min f$, where $(x_n),\,(y_n)$ are the sequences generated by our algorithm. Further, we obtain fast convergence to zero of the discrete velocity, but also some estimates concerning the value of the gradient of the objective function in the generated sequences.
9.Impact of environmental constraints in hydrothermal energy planning
Authors:Luís Felipe Bueno, André Luiz Diniz, Rafael Durbano Lobato, Claudia Sagastizábal, Kenny Vinente
Abstract: As a follow-up of the industrial problems dealt with in 2018, 2019, 2021 and 2022, in partnership with CCEE and CEPEL, in 2023 the study group Energy planning and environmental constraints focused on the impact that prioritizing multiple uses of water has on the electric energy production systems, especially in predominantly hydro systems, which is the case of Brazil. In order to model environmental constraints in the long-term hydrothermal generation planning problem, the resulting large-scale multi-stage linear programming problem was modelled in JuMP and solved by stochastic dual dynamic programming. To assess if the development represented well the behavior of the Brazilian power system, the Julia formulation was first benchmarked with Brazil's official model, Newave. Environmental constraints were introduced in this problem by two different approaches, one that represents the multiple uses of water by means of 0-1 variables, and another one that makes piecewise linear approximations of the relevant constraints. Numerical results show that penalties of slack variables strongly affect the obtained water values.
10.Optimal design of vaccination policies: A case study for Newfoundland and Labrador
Authors:Faraz Khoshbakhtian, Hamidreza Validi, Mario Ventresca, Dionne Aleman
Abstract: This paper proposes pandemic mitigation vaccination policies for Newfoundland and Labrador (NL) based on two compact mixed integer programming (MIP) models of the distance-based critical node detection problem (DCNDP). Our main focus is on two variants of the DCNDP that seek to minimize the number of connections with lengths of at most one (1-DCNDP) and two (2-DCNDP). A polyhedral study for the 1-DCNDP is conducted, and new aggregated inequalities are provided for the 2-DCNDP. The computational experiments show that the 2-DCNDP with aggregated inequalities outperforms the one with disaggregated inequalities for graphs with a density of at least 0.5%. We also study the strategic vaccine allocation problem as a real-world application of the DCNDP and conduct a set of computational experiments on a simulated contact network of NL. Our computational results demonstrate that the DCNDP-based strategies can have a better performance in comparison with the real-world strategies implemented during COVID-19.
11.Improving preference disaggregation in multicriteria decision making: incorporating time series analysis and a multi-objective approach
Authors:Betania S. C. Campello, Sarah BenAmor, Leonardo T. Duarte, João Marcos Travassos Romano
Abstract: Preference disaggregation analysis (PDA) is a widely used approach in multicriteria decision analysis that aims to extract preferential information from holistic judgments provided by decision makers. This paper presents an original methodological framework for PDA that addresses two significant challenges in this field. Firstly, it considers the multidimensional structure of data to capture decision makers' preferences based on descriptive measures of the criteria time series, such as trend and average. This novel approach enables an understanding of decision makers' preferences in decision-making scenarios involving time series analysis, which is common in medium- to long-term impact decisions. Secondly, the paper addresses the robustness issue commonly encountered in PDA methods by proposing a multi-objective and Monte Carlo simulation approach. This approach enables the consideration of multiple preference models and provides a mechanism to converge towards the most likely preference model. The proposed method is evaluated using real data, demonstrating its effectiveness in capturing preferences based on criteria and time series descriptive measures. The multi-objective analysis highlights the generation of multiple solutions, and, under specific conditions, reveals the possibility of achieving convergence towards a single solution that represents the decision maker's preferences.
1.Symplectic Discretization Approach for Developing New Proximal Point Algorithms
Authors:Ya-xiang Yuan, Yi Zhang
Abstract: Proximal point algorithms have found numerous applications in the field of convex optimization, and their accelerated forms have also been proposed. However, the most commonly used accelerated proximal point algorithm was first introduced in 1967, and recent studies on accelerating proximal point algorithms are relatively scarce. In this paper, we propose high-resolution ODEs for the proximal point operators for both closed proper convex functions and maximally monotone operators, and present a Lyapunov function framework to demonstrate that the trajectories of our high-resolution ODEs exhibit accelerated behavior. Subsequently, by symplectically discretizing our high-resolution ODEs, we obtain new proximal point algorithms known as symplectic proximal point algorithms. By decomposing the continuous-time Lyapunov function into its elementary components, we demonstrate that symplectic proximal point algorithms possess $O(1/k^2)$ convergence rates.
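As background for the abstract above, the classical (non-accelerated) proximal point iteration $x_{k+1} = \mathrm{prox}_{\lambda f}(x_k)$ can be sketched in a few lines. This is only the baseline scheme, not the symplectic variant proposed in the paper; the objective $f(x) = |x|$ and the parameter `lam` are assumptions chosen because the proximal operator then has a closed form (soft-thresholding).

```python
def prox_abs(v, lam):
    """Proximal operator of f(x) = |x|: soft-thresholding with threshold lam."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def proximal_point(x0, lam, iters):
    """Run x_{k+1} = prox_{lam * f}(x_k) for f(x) = |x| and return all iterates."""
    xs = [x0]
    for _ in range(iters):
        xs.append(prox_abs(xs[-1], lam))
    return xs


trajectory = proximal_point(x0=3.0, lam=0.5, iters=8)
```

On this toy problem each step moves the iterate a distance `lam` toward the minimizer at zero and then stays there, so the iterates never overshoot.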
1.Non-Convex Bilevel Optimization with Time-Varying Objective Functions
Authors:Sen Lin, Daouda Sow, Kaiyi Ji, Yingbin Liang, Ness Shroff
Abstract: Bilevel optimization has become a powerful tool in a wide variety of machine learning problems. However, the current nonconvex bilevel optimization considers an offline dataset and static functions, which may not work well in emerging online applications with streaming data and time-varying functions. In this work, we study online bilevel optimization (OBO) where the functions can be time-varying and the agent continuously updates the decisions with online streaming data. To deal with the function variations and the unavailability of the true hypergradients in OBO, we propose a single-loop online bilevel optimizer with window averaging (SOBOW), which updates the outer-level decision based on a window average of the most recent hypergradient estimations stored in the memory. Compared to existing algorithms, SOBOW is computationally efficient and does not need to know previous functions. To handle the unique technical difficulties rooted in single-loop update and function variations for OBO, we develop a novel analytical technique that disentangles the complex couplings between decision variables, and carefully controls the hypergradient estimation error. We show that SOBOW can achieve a sublinear bilevel local regret under mild conditions. Extensive experiments across multiple domains corroborate the effectiveness of SOBOW.
2.Multi-criteria scheduling of realistic flexible job shop: a novel approach for integrating simulation modelling and multi-criteria decision making
Authors:M. Thenarasu, K. Rameshkumar, M. Di Mascolo, S. P. Anbuudayasankar
Abstract: Increased flexibility in job shops leads to more complexity in decision-making for shop floor engineers. Partial Flexible Job Shop Scheduling (PFJSS) is a subset of Job shop problems and has substantial application in the real world. Priority Dispatching Rules (PDRs) are simple and easy to implement for making quick decisions in real-time. The present study proposes a novel method of integrating Multi-Criteria Decision Making (MCDM) methods and the Discrete Event Simulation (DES) Model to define job priorities in large-scale problems involving multiple criteria. DES approach is employed to model the PFJSS to evaluate Makespan, Flow Time, and Tardiness-based measures considering static and dynamic job arrivals. The proposed approach is implemented in a benchmark problem and large-scale PFJSS. The integration of MCDM methods and simulation models offers the flexibility to choose the parameters that need to govern the ranking of jobs. The solution given by the proposed methods is tested with the best-performing Composite Dispatching Rules (CDR), combining several PDR, which are available in the literature. Proposed MCDM approaches perform well for Makespan, Flow Time, and Tardiness-based measures for large-scale real-world problems. The proposed methodology integrated with the DES model is easy to implement in a real-time shop floor environment.
3.Optimal Design of Line Replaceable Units
Authors:Joni Driessen, Joost de Kruijf, Joachim Arts, Geert-Jan van Houtum
Abstract: A Line Replaceable Unit (LRU) is a collection of connected parts in a system that is replaced when any part of the LRU fails. Companies use LRUs as a mechanism to reduce downtime of systems following a failure. The design of LRUs determines how fast a replacement is performed, so a smart design reduces replacement and downtime cost. A firm must purchase/repair an LRU upon failure, and large LRUs are more expensive to purchase/repair. Hence, a firm seeks to design LRUs such that the average costs per time unit are minimized. We formalize this problem in a new model that captures how parts in a system are connected, and how they are disassembled from the system. Our model optimizes the design of LRUs such that the replacement (and downtime) costs and LRU purchase/repair costs are minimized. We present a set partitioning formulation for which we prove a rare result: the optimal solution is integer, despite a non-integral feasible polyhedron. Secondly, we formulate our problem as a binary linear program. The paper concludes by numerically comparing the computation times of both formulations and illustrates the effects of various parameters on the model's outcome.
4.Approximate propagation of normal distributions for stochastic optimal control of nonsmooth systems
Authors:Florian Messerer, Katrin Baumgärtner, Armin Nurkanović, Moritz Diehl
Abstract: We present a method for the approximate propagation of mean and covariance of a probability distribution through ordinary differential equations (ODE) with discontinuous right-hand side. For piecewise affine systems, a normalization of the propagated probability distribution at every time step allows us to analytically compute the expectation integrals of the mean and covariance dynamics while explicitly taking into account the discontinuity. This leads to a natural smoothing of the discontinuity such that for relevant levels of uncertainty the resulting ODE can be integrated directly with standard schemes and it is neither necessary to prespecify the switching sequence nor to use a switch detection method. We then show how this result can be employed in the more general case of piecewise smooth functions based on a structure preserving linearization scheme. The resulting dynamics can be straightforwardly used within standard formulations of stochastic optimal control problems with chance constraints.
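The "natural smoothing of the discontinuity" can be seen in the simplest scalar case: if $x \sim \mathcal{N}(\mu, \sigma^2)$, then $\mathbb{E}[\operatorname{sign}(x)] = \operatorname{erf}\big(\mu / (\sigma\sqrt{2})\big)$, which is smooth in $\mu$ whenever $\sigma > 0$. The sketch below evaluates this standard identity; the scalar sign nonlinearity and the function name `expected_sign` are illustrative assumptions and do not reproduce the paper's ODE propagation scheme.

```python
import math


def expected_sign(mean, std):
    """E[sign(x)] for x ~ N(mean, std^2), with std > 0.

    Equals 2*Phi(mean/std) - 1 = erf(mean / (std*sqrt(2))): the jump of
    sign(.) at zero is replaced by a smooth transition of width ~ std.
    """
    return math.erf(mean / (std * math.sqrt(2.0)))


# The smaller the uncertainty, the sharper the transition around zero.
wide = [expected_sign(m, 1.0) for m in (-1.0, 0.0, 1.0)]
narrow = [expected_sign(m, 0.1) for m in (-1.0, 0.0, 1.0)]
```

As `std` shrinks the expectation approaches the original discontinuous sign function, which is why the smoothing is effective only "for relevant levels of uncertainty", as the abstract puts it.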
5.Feasible approximation of matching equilibria for large-scale matching for teams problems
Authors:Ariel Neufeld, Qikun Xiang
Abstract: We propose a numerical algorithm for computing approximately optimal solutions of the matching for teams problem. Our algorithm is efficient for problems involving a large number of agent categories and allows for the measures describing the agent types to be non-discrete. Specifically, we parametrize the so-called transfer functions and develop a parametric version of the dual formulation. Our algorithm tackles this parametric formulation and produces feasible and approximately optimal solutions for the primal and dual formulations of the matching for teams problem. These solutions also yield upper and lower bounds for the optimal value, and the difference between the upper and lower bounds provides a direct sub-optimality estimate of the computed solutions. Moreover, we are able to control a theoretical upper bound on the sub-optimality to be arbitrarily close to 0 under mild conditions. We subsequently prove that the approximate primal and dual solutions converge when the sub-optimality goes to 0 and their limits constitute a true matching equilibrium. Thus, the outputs of our algorithm are regarded as an approximate matching equilibrium. We also analyze the theoretical computational complexity of our parametric formulation as well as the sparsity of the resulting approximate matching equilibrium. Through numerical experiments, we showcase that the proposed algorithm can produce high-quality approximate matching equilibria and is applicable to versatile settings, including a high-dimensional setting involving 100 agent categories.
6.A Branch-and-Cut-and-Price Algorithm for Cutting Stock and Related Problems
Authors:Renan F. F. da Silva, Rafael C. S. Schouery
Abstract: We present a branch-and-cut-and-price framework to solve Cutting Stock Problems with strong relaxations using Set Covering (Partition) Formulations, which are solved by column generation. We propose an extended Ryan-Foster branching scheme for non-binary models, a pricing algorithm that converges in a few iterations, and a variable selection algorithm based on branching history. These strategies are combined with subset-row cuts and custom primal heuristics to create a framework that overcomes the current state-of-the-art for the following problems: Cutting Stock, Skiving Stock, Ordered Open-End Bin Packing, Class-Constrained Bin Packing, and Identical Parallel Machines Scheduling with Minimum Makespan. Additionally, a new challenging benchmark for Cutting Stock is introduced.
7.RIP-based Performance Guarantee for Low Rank Matrix Recovery via $L_{*-F}$ Minimization
Authors:Yan Li, Liping Zhang
Abstract: In the undetermined linear system $\bm{b}=\mathcal{A}(\bm{X})+\bm{s}$, vector $\bm{b}$ and operator $\mathcal{A}$ are the known measurements and $\bm{s}$ is the unknown noise. In this paper, we investigate sufficient conditions for exactly reconstructing a desired matrix $\bm{X}$ that is low-rank or approximately low-rank. We use the difference of nuclear norm and Frobenius norm ($L_{*-F}$) as a surrogate for the rank function and establish a new nonconvex relaxation of such low rank matrix recovery, called the $L_{*-F}$ minimization, in order to approximate the rank function more closely. For such nonconvex and nonsmooth constrained $L_{*-F}$ minimization problems, we give upper bound estimates of the recovery error for the noise-free case and the noisy case, respectively. In particular, in the noise-free case, one sufficient condition for exact recovery is presented. If the linear operator $\mathcal{A}$ satisfies the restricted isometry property with $\delta_{4r}<\frac{\sqrt{2r}-1}{\sqrt{2r}-1+\sqrt{2}(\sqrt{2r}+1)}$, then an $r$-rank matrix $\bm{X}$ can be exactly recovered without other assumptions. In addition, we also examine the regularized $L_{*-F}$ minimization model, since such a regularized model is more widely used in algorithm design. We provide the recovery error estimation of this regularized $L_{*-F}$ minimization model via the RIP. To our knowledge, this is the first result on exact reconstruction of low rank matrices via regularized $L_{*-F}$ minimization.
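To get a feel for the restricted isometry condition quoted in the abstract above, the right-hand side of the bound on $\delta_{4r}$ can simply be evaluated for a few ranks. The sketch below does only that arithmetic; the function name `rip_threshold` is hypothetical and not taken from the paper.

```python
import math

def rip_threshold(r: int) -> float:
    """Evaluate (sqrt(2r) - 1) / (sqrt(2r) - 1 + sqrt(2) * (sqrt(2r) + 1)).

    This is the right-hand side of the bound on delta_{4r} quoted in the
    abstract; the function name is an illustrative assumption.
    """
    if r < 1:
        raise ValueError("rank r must be a positive integer")
    t = math.sqrt(2.0 * r)
    return (t - 1.0) / (t - 1.0 + math.sqrt(2.0) * (t + 1.0))

if __name__ == "__main__":
    # Print the admissible upper bound on delta_{4r} for a few ranks.
    for r in (1, 2, 5, 10, 100):
        print(r, rip_threshold(r))
```

Algebraically, the expression increases with $r$ and stays below $1/(1+\sqrt{2})$, so the condition is most restrictive for rank-one matrices.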
8.Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization
Authors:Frank E. Curtis, Xin Jiang, Qi Wang
Abstract: Stochastic sequential quadratic optimization (SQP) methods for solving continuous optimization problems with nonlinear equality constraints have attracted attention recently, such as for solving large-scale data-fitting problems subject to nonconvex constraints. However, for a recently proposed subclass of such methods that is built on the popular stochastic-gradient methodology from the unconstrained setting, convergence guarantees have been limited to the asymptotic convergence of the expected value of a stationarity measure to zero. This is in contrast to the unconstrained setting in which almost-sure convergence guarantees (of the gradient of the objective to zero) can be proved for stochastic-gradient-based methods. In this paper, new almost-sure convergence guarantees for the primal iterates, Lagrange multipliers, and stationarity measures generated by a stochastic SQP algorithm in this subclass of methods are proved. It is shown that the error in the Lagrange multipliers can be bounded by the distance of the primal iterate to a primal stationary point plus the error in the latest stochastic gradient estimate. It is further shown that, subject to certain assumptions, this latter error can be made to vanish by employing a running average of the Lagrange multipliers that are computed during the run of the algorithm. The results of numerical experiments are provided to demonstrate the proved theoretical guarantees.
9.Quadratic-exponential coherent feedback control of linear quantum stochastic systems
Authors:Igor G. Vladimirov, Ian R. Petersen
Abstract: This paper considers a risk-sensitive optimal control problem for a field-mediated interconnection of a quantum plant with a coherent (measurement-free) quantum controller. The plant and the controller are multimode open quantum harmonic oscillators governed by linear quantum stochastic differential equations, which are coupled to each other and driven by multichannel quantum Wiener processes modelling the external bosonic fields. The control objective is to internally stabilize the closed-loop system and minimize the infinite-horizon asymptotic growth rate of a quadratic-exponential functional which penalizes the plant variables and the controller output. We obtain first-order necessary conditions of optimality for this problem by computing the partial Fréchet derivatives of the cost functional with respect to the energy and coupling matrices of the controller in frequency domain and state space. An infinitesimal equivalence between the risk-sensitive and weighted coherent quantum LQG control problems is also established. In addition to variational methods, we employ spectral factorizations and infinite cascades of auxiliary classical systems. Their truncations are applicable to numerical optimization algorithms (such as the gradient descent) for coherent quantum risk-sensitive feedback synthesis.
1.Optimization on Pareto sets: On a theory of multi-objective optimization
Authors:Abhishek Roy, Geelon So, Yi-An Ma
Abstract: In multi-objective optimization, a single decision vector must balance the trade-offs between many objectives. Solutions achieving an optimal trade-off are said to be Pareto optimal: these are decision vectors for which improving any one objective must come at a cost to another. But as the set of Pareto optimal vectors can be very large, we further consider a more practically significant Pareto-constrained optimization problem, where the goal is to optimize a preference function constrained to the Pareto set. We investigate local methods for solving this constrained optimization problem, which poses significant challenges because the constraint set is (i) implicitly defined, and (ii) generally non-convex and non-smooth, even when the objectives are. We define notions of optimality and stationarity, and provide an algorithm with a last-iterate convergence rate of $O(K^{-1/2})$ to stationarity when the objectives are strongly convex and Lipschitz smooth.
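The notion of Pareto optimality described in the abstract above can be illustrated on a finite set of candidate objective vectors. The following is a minimal sketch, assuming all objectives are to be minimized; it is a brute-force filter over a finite list, not the local method studied in the paper, and the names `dominates` and `pareto_front` are hypothetical.

```python
from typing import List, Sequence, Tuple

def dominates(q: Sequence[float], p: Sequence[float]) -> bool:
    """Return True if q is at least as good as p in every objective
    and strictly better in at least one (minimization assumed)."""
    return all(a <= b for a, b in zip(q, p)) and any(a < b for a, b in zip(q, p))

def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

if __name__ == "__main__":
    candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
    # (3, 3) is dominated by (2, 2); the other three are mutually incomparable.
    print(pareto_front(candidates))
```

A preference function could then be minimized over the returned list, which is the finite analogue of optimizing over the Pareto set.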
2.Completely Abstract Dynamic Programming
Authors:Thomas J. Sargent, John Stachurski
Abstract: We introduce a completely abstract dynamic programming framework in which dynamic programs are sets of policy operators acting on a partially ordered space. We provide an optimality theory based on high-level assumptions. We then study symmetric and asymmetric relationships between dynamic programs, and show how these relationships transmit optimality properties. Our formulation includes and extends applications of dynamic programming across many fields.
3.Optimal Control of Stationary Doubly Diffusive Flows on Two and Three Dimensional Bounded Lipschitz Domains: A Theoretical Study
Authors:Jai Tushar, Arbaz Khan, Manil T. Mohan
Abstract: In this work, a theoretical framework is developed to study the control constrained distributed optimal control of a stationary double diffusion model presented in [Burger, Mendez, Ruiz-Baier, SINUM (2019), 57:1318-1343]. For the control problem, as the source term belongs to a weaker space, a new solvability analysis of the governing equation is presented using Faedo-Galerkin approximation techniques. Some new minimal regularity results for the governing equation are established on two and three-dimensional bounded Lipschitz domains and are of independent interest. Moreover, we show the existence of an optimal control with quadratic type cost functional, study the Fréchet differentiability properties of the control-to-state map and establish the first-order necessary optimality conditions corresponding to the optimal control problem.
4.Adaptive Proximal Gradient Method for Convex Optimization
Authors:Yura Malitsky, Konstantin Mishchenko
Abstract: In this paper, we explore two fundamental first-order algorithms in convex optimization, namely, gradient descent (GD) and proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].
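As an illustration of what a step size "based on observed gradient differences" can look like, here is a minimal sketch of gradient descent whose step is bounded by a local curvature estimate $\|x_k - x_{k-1}\| / (2\|\nabla f(x_k) - \nabla f(x_{k-1})\|)$ and by a growth cap $\sqrt{1+\theta_{k-1}}\,\lambda_{k-1}$. This specific rule is an assumption in the spirit of the earlier work the abstract cites as [MM20]; it is not the improved rule of this paper, and the names `adaptive_gd`, `lam0`, and `tol` are hypothetical.

```python
import math
from typing import Callable, List

def _norm(v: List[float]) -> float:
    return math.sqrt(sum(c * c for c in v))

def adaptive_gd(grad: Callable[[List[float]], List[float]], x0: List[float],
                lam0: float = 1e-6, max_steps: int = 10000,
                tol: float = 1e-12) -> List[float]:
    """Gradient descent with a step size estimated from gradient differences.

    Assumed rule (not taken from the abstract):
        lam_k   = min(sqrt(1 + theta_{k-1}) * lam_{k-1},
                      ||x_k - x_{k-1}|| / (2 ||g_k - g_{k-1}||))
        theta_k = lam_k / lam_{k-1}
    """
    x_prev = list(x0)
    g_prev = grad(x_prev)
    lam_prev = lam0
    theta = 1e12  # large finite stand-in for an "infinite" initial ratio
    x = [xi - lam_prev * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(max_steps):
        g = grad(x)
        if _norm(g) <= tol:  # stop once the gradient is numerically zero
            break
        dx = _norm([a - b for a, b in zip(x, x_prev)])
        dg = _norm([a - b for a, b in zip(g, g_prev)])
        growth = math.sqrt(1.0 + theta) * lam_prev
        # If the gradient did not change, only the growth cap applies.
        lam = min(growth, dx / (2.0 * dg)) if dg > 0.0 else growth
        x_prev, g_prev = x, g
        x = [xi - lam * gi for xi, gi in zip(x, g)]
        theta = lam / lam_prev
        lam_prev = lam
    return x

if __name__ == "__main__":
    # Ill-conditioned quadratic f(x) = 0.5 * (x0^2 + 10 * x1^2), minimizer at 0.
    print(adaptive_gd(lambda x: [x[0], 10.0 * x[1]], [1.0, 1.0]))
```

No Lipschitz constant or line search is supplied: the only inputs are the gradient oracle and a small initial step, which matches the "no added computational costs" description.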
5.Blessing of High-Order Dimensionality: from Non-Convex to Convex Optimization for Sensor Network Localization
Authors:Mingyu Lei, Jiayu Zhang, Yinyu Ye
Abstract: This paper investigates the Sensor Network Localization (SNL) problem, which seeks to determine sensor locations based on known anchor locations and partially given anchors-sensors and sensors-sensors distances. Two primary methods for solving the SNL problem are analyzed: the low-dimensional method that directly minimizes a loss function, and the high-dimensional semi-definite relaxation (SDR) method that reformulates the SNL problem as an SDP (semi-definite programming) problem. The paper primarily focuses on the intrinsic non-convexity of the loss function of the low-dimensional method, which is shown in our main theorem. The SDR method, via second-order dimension augmentation, is discussed in the context of its ability to transform non-convex problems into convex ones; while the first-order direct dimension augmentation fails. Additionally, we will show that more edges don't necessarily contribute to the better convexity of the loss function. Moreover, we provide an explanation for the success of the SDR+GD (gradient descent) method which uses the SDR solution as a warm-start of the minimization of the loss function by gradient descent. The paper also explores the parallels among SNL, max-cut, and neural networks in terms of the blessing of high-order dimension augmentation.
6.Approximation of deterministic mean field type control systems
Authors:Yurii Averboukh
Abstract: The paper is concerned with the approximation of the deterministic mean field type control system by a mean field Markov chain. It turns out that the dynamics of the distribution in the approximating system is described by a system of ordinary differential equations. Given a strategy for the Markov chain, we explicitly construct a control in the deterministic mean field type control system. Our method is a realization of the model predictive approach. The converse construction is also presented. These results lead to an estimate of the Hausdorff distance between the bundles of motions in the deterministic mean field type control system and the mean field Markov chain. In particular, we pay attention to the case when one can approximate the bundle of motions in the mean field type system by solutions of a finite system of ODEs.
1.Optimal Distributed Control for a Cahn-Hilliard-Darcy System with Mass Sources, Unmatched Viscosities and Singular Potential
Authors:Marco Abatangelo, Cecilia Cavaterra, Maurizio Grasselli, Hao Wu
Abstract: We study a Cahn-Hilliard-Darcy system in two dimensions with mass sources, unmatched viscosities and singular potential. This system is equipped with no-flux boundary conditions for the (volume) averaged velocity $\mathbf{u}$, the difference of the volume fractions $\varphi$, and the chemical potential $\mu$, along with an initial condition for $\varphi$. The resulting initial boundary value problem can be considered as a basic, though simplified, model for the evolution of solid tumor growth. The source term in the Cahn-Hilliard equation contains a control $R$ that can be thought of, for instance, as a drug or a nutrient. Our goal is to study an optimal control problem with a tracking type cost functional given by the sum of three $L^2$ norms involving $\varphi(T)$ ($T>0$ is the final time), $\varphi$ and $R$. We first prove the existence and uniqueness of a global strong solution with $\varphi$ being strictly separated from the pure phases $\pm 1$. Thanks to this result, we are able to analyze the control-to-state mapping $\mathcal{S}: R \mapsto \varphi$, obtaining the existence of an optimal control, the Fr\'{e}chet differentiability of $\mathcal{S}$ and first-order necessary optimality conditions expressed through a suitable variational inequality for the adjoint variables. Finally, we show the differentiability of the control-to-costate operator and establish a second-order sufficient condition for the strict local optimality.
2.Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the Tensor Nuclear Norm Under Strict Complementarity
Authors:Dan Garber, Atara Kaplan
Abstract: We consider convex relaxations for recovering low-rank tensors based on constrained minimization over a ball induced by the tensor nuclear norm, recently introduced in \cite{tensor_tSVD}. We build on a recent line of results that considered convex relaxations for the recovery of low-rank matrices and established that under a strict complementarity condition (SC), both the convergence rate and per-iteration runtime of standard gradient methods may improve dramatically. We develop the appropriate strict complementarity condition for the tensor nuclear norm ball and obtain the following main results under this condition: 1. When the objective to minimize is of the form $f(\mX)=g(\mA\mX)+\langle{\mC,\mX}\rangle$ , where $g$ is strongly convex and $\mA$ is a linear map (e.g., least squares), a quadratic growth bound holds, which implies linear convergence rates for standard projected gradient methods, despite the fact that $f$ need not be strongly convex. 2. For a smooth objective function, when initialized in certain proximity of an optimal solution which satisfies SC, standard projected gradient methods only require SVD computations (for projecting onto the tensor nuclear norm ball) of rank that matches the tubal rank of the optimal solution. In particular, when the tubal rank is constant, this implies nearly linear (in the size of the tensor) runtime per iteration, as opposed to super linear without further assumptions. 3. For a nonsmooth objective function which admits a popular smooth saddle-point formulation, we derive similar results to the latter for the well known extragradient method. An additional contribution which may be of independent interest, is the rigorous extension of many basic results regarding tensors of arbitrary order, which were previously obtained only for third-order tensors.
3.Topology Optimization for Uniform Flow Distribution in Electrolysis Cells
Authors:Leon Baeck, Sebastian Blauth, Christian Leithäuser, René Pinnau, Kevin Sturm
Abstract: In this paper we consider the topology optimization for a bipolar plate of a hydrogen electrolysis cell. We present a model for the bipolar plate using the Stokes equation with an additional drag term, which models the influence of fluid and solid regions. Furthermore, we derive a criterion for a uniform flow distribution in the bipolar plate. To obtain shapes that are well-manufacturable, we introduce a novel smoothing technique for the fluid velocity. Finally, we present some numerical results and investigate the influence of the smoothing on the obtained shapes.
4.Subspace-Constrained Continuous Methane Leak Monitoring and Optimal Sensor Placement
Authors:Kashif Rashid, Lukasz Zielinski, Junyi Yuan, Andrew Speck
Abstract: This work presents a procedure that can quickly identify and isolate methane emission sources leading to expedient remediation. Minimizing the time required to identify a leak and the subsequent time to dispatch repair crews can significantly reduce the amount of methane released into the atmosphere. The procedure developed utilizes permanently installed low-cost methane sensors at an oilfield facility to continuously monitor leaked gas concentration above background levels. The methods developed for optimal sensor placement and leak inversion in consideration of predefined subspaces and restricted zones are presented. In particular, subspaces represent regions comprising one or more equipment items that may leak, and restricted zones define regions in which a sensor may not be placed due to site restrictions by design. Thus, subspaces constrain the inversion problem to specified locales, while restricted zones constrain sensor placement to feasible zones. The development of synthetic wind models, and those based on historical data, are also presented as a means to accommodate optimal sensor placement under wind uncertainty. The wind models serve as realizations for planning purposes, with the aim of maximizing the mean coverage measure for a given number of sensors. Once the optimal design is established, continuous real-time monitoring permits localization and quantification of a methane leak source. The necessary methods, mathematical formulation and demonstrative test results are presented.
5.Energy System Optimisation using (Mixed Integer) Linear Programming
Authors:Sebastian Miehling, Andreas Hanel, Jerry Lambert, Sebastian Fendt, Hartmut Spliethoff
Abstract: Although energy system optimisation based on linear optimisation is often used for influential energy outlooks and studies for political decision-makers, the underlying background still needs to be described in the scientific literature in a concise and general form. This study presents the main equations and advanced ideas and explains further possibilities mixed integer linear programming offers in energy system optimisation. Furthermore, the equations are shown using an example system to present a more practical point of view. Therefore, this study is aimed at researchers trying to understand the background of studies using energy system optimisation and researchers building their implementation into a new framework. This study describes how to build a standard model, how to implement advanced equations using linear programming, and how to implement advanced equations using mixed integer linear programming, as well as shows a small exemplary system. - Presentation of the OpTUMus energy system optimisation framework - Set of equations for a fully functional energy system model - Example of a simple energy system model
1.Accelerated Benders Decomposition for Variable-Height Transport Packaging Optimisation
Authors:Alain Lehmann, Wilhelm Kleiminger, Hakim Invernizzi, Aurel Gautschi
Abstract: This paper tackles the problem of finding optimal variable-height transport packaging. The goal is to reduce the empty space left in a box when shipping goods to customers, thereby saving on filler and reducing waste. We cast this problem as a large-scale mixed integer problem (with over seven billion variables) and demonstrate various acceleration techniques to solve it efficiently in about three hours on a laptop. We present a KD-Tree algorithm to avoid exhaustive grid evaluation of the 3D-bin-packing, provide analytical transformations to accelerate the Benders decomposition, and an efficient implementation of the Benders subproblem for significant memory savings and a runtime speedup of three orders of magnitude.
2.Multiobjective Optimization of Non-Smooth PDE-Constrained Problems
Authors:Marco Bernreuther, Michael Dellnitz, Bennet Gebken, Georg Müller, Sebastian Peitz, Konstantin Sonntag, Stefan Volkwein
Abstract: Multiobjective optimization plays an increasingly important role in modern applications, where several criteria are often of equal importance. The task in multiobjective optimization and multiobjective optimal control is therefore to compute the set of optimal compromises (the Pareto set) between the conflicting objectives. The advances in algorithms and the increasing interest in Pareto-optimal solutions have led to a wide range of new applications related to optimal and feedback control - potentially with non-smoothness both on the level of the objectives or in the system dynamics. This results in new challenges such as dealing with expensive models (e.g., governed by partial differential equations (PDEs)) and developing dedicated algorithms handling the non-smoothness. Since in contrast to single-objective optimization, the Pareto set generally consists of an infinite number of solutions, the computational effort can quickly become challenging, which is particularly problematic when the objectives are costly to evaluate or when a solution has to be presented very quickly. This article gives an overview of recent developments in the field of multiobjective optimization of non-smooth PDE-constrained problems. In particular we report on the advances achieved within Project 2 "Multiobjective Optimization of Non-Smooth PDE-Constrained Problems - Switches, State Constraints and Model Order Reduction" of the DFG Priority Programm 1962 "Non-smooth and Complementarity-based Distributed Parameter Systems: Simulation and Hierarchical Optimization".
3.Optimal Mixed Strategies to the Zero-sum Linear Differential Game
Authors:Tao Xu, Wang Xi, Jianping He
Abstract: This paper exploits the weak approximation method to study a zero-sum linear differential game under mixed strategies. The stochastic nature of mixed strategies poses challenges in evaluating the game value and deriving the optimal strategies. To overcome these challenges, we first define the mixed strategy based on time discretization given the control period $\delta$. Then, we design a stochastic differential equation (SDE) to approximate the discretized game dynamic with a small approximation error of scale $\mathcal{O}(\delta^2)$ in the weak sense. Moreover, we prove that the game payoff is also approximated in the same order of accuracy. Next, we solve the optimal mixed strategies and game values for the linear quadratic differential games. The effect of the control period is explicitly analyzed when the payoff is a terminal cost. Our results provide the first implementable form of the optimal mixed strategies for a zero-sum linear differential game. Finally, we provide numerical examples to illustrate and elaborate on our results.
4.Stochastic smoothing accelerated gradient method for nonsmooth convex composite optimization
Authors:Ruyu Wang, Chao Zhang
Abstract: We propose a novel stochastic smoothing accelerated gradient (SSAG) method for general constrained nonsmooth convex composite optimization, and analyze the convergence rates. The SSAG method allows various smoothing techniques, and can handle a nonsmooth term whose proximal operator is not easy to compute, or that does not have the linear max structure. To the best of our knowledge, it is the first stochastic approximation type method with a solid convergence result for solving the convex composite optimization problem whose nonsmooth term is the maximization of numerous nonlinear convex functions. We prove that the SSAG method achieves the best-known complexity bounds in terms of the stochastic first-order oracle ($\mathcal{SFO}$), using either diminishing smoothing parameters or a fixed smoothing parameter. We give two applications of our results to distributionally robust optimization problems. Numerical results on the two applications demonstrate the effectiveness and efficiency of the proposed SSAG method.
5.Revitalizing Public Transit in Low Ridership Areas: An Exploration of On-Demand Multimodal Transit Systems
Authors:Jiawei Lu, Connor Riley, Krishna Murthy Gurumurthy, Pascal Van Hentenryck
Abstract: Public transit plays an essential role in mitigating traffic congestion, reducing emissions, and enhancing travel accessibility and equity. One of the critical challenges in designing public transit systems is distributing finite service supplies temporally and spatially to accommodate time-varying and space-heterogeneous travel demands. Particularly, for regions with low or scattered ridership, there is a dilemma in designing traditional transit lines and corresponding service frequencies. Dense transit lines and high service frequency increase operation costs, while sparse transit lines and low service frequency result in poor accessibility and long passenger waiting time. In the coming era of Mobility-as-a-Service, the aforementioned challenge is expected to be addressed by on-demand services. In this study, we design an On-Demand Multimodal Transit System (ODMTS) for regions with low or scattered travel demands, in which some low-ridership bus lines are replaced with flexible on-demand ride-sharing shuttles. In the proposed ODMTS, riders within service regions can request shuttles to finish their trips or to connect to fixed-route services such as bus, metro, and light rail. Leveraging the integrated transportation system modeling platform, POLARIS, a simulation-based case study is conducted to assess the effectiveness of this system in Austin, Texas.
1.Practical asymptotic stability of data-driven model predictive control using extended DMD
Authors:Lea Bold, Lars Grüne, Manuel Schaller, Karl Worthmann
Abstract: The extended Dynamic Mode Decomposition (eDMD) is a very popular method to obtain data-driven surrogate models for nonlinear (control) systems governed by ordinary and stochastic differential equations. Its theoretical foundation is the Koopman framework, in which one propagates observable functions of the state to obtain a linear representation in an infinite-dimensional space. In this work, we prove practical asymptotic stability of a (controlled) equilibrium for eDMD-based model predictive control, in which the optimization step is conducted using the data-based surrogate model. To this end, we derive error bounds that converge to zero if the state approaches the desired equilibrium. Further, we show that, if the underlying system is cost controllable, then this stabilizability property is preserved. We conduct numerical simulations, which illustrate the proven practical asymptotic stability.
2.Threshold-aware Learning to Generate Feasible Solutions for Mixed Integer Programs
Authors:Taehyun Yoon, Jinwon Choi, Hyokun Yun, Sungbin Lim
Abstract: Finding a high-quality feasible solution to a combinatorial optimization (CO) problem in a limited time is challenging due to its discrete nature. Recently, there has been an increasing number of machine learning (ML) methods for addressing CO problems. Neural diving (ND) is one of the learning-based approaches to generating partial discrete variable assignments in Mixed Integer Programs (MIP), a framework for modeling CO problems. However, a major drawback of ND is a large discrepancy between the ML and MIP objectives, i.e., variable value classification accuracy over primal bound. Our study finds that a specific range of variable assignment rates (coverage) yields high-quality feasible solutions, and we suggest that optimizing the coverage bridges the gap between the learning and MIP objectives. Consequently, we introduce a post-hoc method and a learning-based approach for optimizing the coverage. A key idea of our approach is to jointly learn to restrict the coverage search space and to predict the coverage in the learned search space. Experimental results demonstrate that learning a deep neural network to estimate the coverage for finding high-quality feasible solutions achieves state-of-the-art performance in NeurIPS ML4CO datasets. In particular, our method shows outstanding performance in the workload apportionment dataset, achieving the optimality gap of 0.45%, a ten-fold improvement over SCIP within the one-minute time limit.
3.Linear-Quadratic Optimal Control Problem for Mean-Field Stochastic Differential Equations with a Type of Random Coefficients
Authors:Hongwei Mei, Qingmeng Wei, Jiongmin Yong
Abstract: Motivated by linear-quadratic optimal control problems (LQ problems, for short) for mean-field stochastic differential equations (SDEs, for short) with the coefficients containing regime switching governed by a Markov chain, we consider an LQ problem for an SDE with the coefficients being adapted to a filtration independent of the Brownian motion driving the control system. The classical approach of completing the square is applied to the current problem and obvious shortcomings are indicated. Open-loop and closed-loop solvability are introduced and characterized.
4.An Efficient Algorithm for Computational Protein Design Problem
Authors:Yukai Zheng, Weikun Chen, Qingna Li
Abstract: A protein is a sequence of basic blocks called amino acids, and it plays an important role in animals and human beings. The computational protein design (CPD) problem is to identify a protein that could perform some given functions. The CPD problem can be formulated as a quadratic semi-assignment problem (QSAP) and is extremely challenging due to its combinatorial properties over different amino acid sequences. In this paper, we first show that the QSAP is equivalent to its continuous relaxation problem, the RQSAP, in the sense that the QSAP and RQSAP share the same optimal solution. Then we design an efficient quadratic penalty method to solve large-scale RQSAP. Numerical results on benchmark instances verify the superior performance of our approach over the state-of-the-art branch-and-cut solvers. In particular, our proposed algorithm outperforms the state-of-the-art solvers by three orders of magnitude in CPU time in most cases while returning a high-quality solution.
5.Maneuvering tracking algorithm for reentry vehicles with guaranteed prescribed performance
Authors:Zongyi Guo, Xiyu Gu, Yonglin Han, Jianguo Guo, Thomas Berger
Abstract: This paper presents a prescribed performance-based tracking control strategy for the atmospheric reentry flight of space vehicles subject to rapid maneuvers during flight mission. A time-triggered non-monotonic performance funnel is proposed with the aim of constraints violation avoidance in the case of sudden changes of the reference trajectory. Compared with traditional prescribed performance control methods, the novel funnel boundary is adaptive with respect to the reference path and is capable of achieving stability under disturbances. A recursive control structure is introduced which does not require any knowledge of specific system parameters. By a stability analysis we show that the tracking error evolves within the prescribed error margin under a condition which represents a trade-off between the reference signal and the performance funnel. The effectiveness of the proposed control scheme is verified by simulations.
6.Adaptive Methods for Variational Inequalities with Relatively Smooth and Relatively Strongly Monotone Operators
Authors:S. S. Ablaev, F. S. Stonyakin, M. S. Alkousa, D. A. Pasechnyuk
Abstract: The article is devoted to some adaptive methods for variational inequalities with relatively smooth and relatively strongly monotone operators. Starting from the recently proposed proximal variant of the extragradient method for this class of problems, we investigate in detail the method with adaptively selected parameter values. An estimate of the convergence rate of this method is proved. The result is generalized to a class of variational inequalities with relatively strongly monotone generalized smooth variational inequality operators. Numerical experiments have been performed for the problem of ridge regression and variational inequality associated with box-simplex games.
7.Robust Railway Network Design based on Strategic Timetables
Authors:Tim Sander, Nadine Friesen, Karl Nachtigall, Nils Nießen
Abstract: Using strategic timetables as input for railway network design has become increasingly popular among western European railway infrastructure operators. Although both railway timetabling and railway network design on their own are well covered by academic research, there is still a gap in the literature concerning timetable-based network design. Therefore, we propose a mixed-integer linear program to design railway infrastructure so that the demand derived from a strategic timetable can be satisfied with minimal infrastructure costs. The demand is given by a list of trains, each featuring start and destination nodes as well as time bounds and a set of frequency and transfer constraints that capture the strategic timetable's main characteristics. During the optimization, the solver decides which railway lines need to be built or expanded and whether travel or headway times must be shortened to meet the demand. Since strategic timetables are subject to uncertainty, we expand the optimization model to a robust version. Uncertain timetables are modelled as discrete scenarios, while uncertain freight train demand is modelled using optional trains, which can be inserted into the resulting timetable if they do not require additional infrastructure. We present computational results for both the deterministic and the robust case and give an outlook on further research.
8.On damping a control system with global aftereffect on quantum graphs
Authors:Sergey Buterin
Abstract: This paper naturally connects the theory of quantum graphs, the control theory and the theory of functional-differential equations. Specifically, we study the problem of damping a control system described by first-order equations on an arbitrary tree graph with global delay. The latter means that the constant delay imposed starting from the initial moment of time propagates through all internal vertices of the graph. By minimizing the energy functional, we arrive at the corresponding variational problem and then prove its equivalence to a self-adjoint boundary value problem on the tree for second-order equations involving both the global delay and the global advance. It is remarkable that the resulting problem acquires Kirchhoff's conditions at the internal vertices of the graph, which often appear in the theory of quantum graphs as well as various applications. The unique solvability of this boundary value problem is proved.
9.On the properties of the linear conjugate gradient method
Authors:Zexian Liu, Qiao Li
Abstract: The linear conjugate gradient method is an efficient iterative method for convex quadratic minimization problems $ \mathop {\min }\limits_{x \in { \mathbb R^n}} f(x) =\dfrac{1}{2}x^TAx+b^Tx $, where $ A \in \mathbb R^{n \times n} $ is symmetric and positive definite and $ b \in \mathbb R^n $. It is generally agreed that the gradients $ g_k $ are not conjugate with respect to $ A $ in the linear conjugate gradient method (see page 111 in Numerical Optimization (2nd ed., Springer, 2006) by Nocedal and Wright). In this paper we prove the conjugacy of the gradients $ g_k $ generated by the linear conjugate gradient method, namely, $$g_k^TAg_i=0, \; i=0,1,\cdots, k-2.$$ In addition, a new way is exploited to derive the linear conjugate gradient method based on the conjugacy of the search directions and the orthogonality of the gradients, rather than the conjugacy of the search directions and the exact stepsize.
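The conjugacy relation stated in this abstract can be checked numerically on a small instance. The following is a minimal sketch, not the authors' code: it runs a textbook linear conjugate gradient iteration in pure Python on a hypothetical 5x5 symmetric positive definite matrix (the matrix, vector, and helper names are illustrative assumptions) and measures the largest value of $|g_k^TAg_i|$ over all pairs with $i \le k-2$.

```python
def matvec(A, x):
    """Dense matrix-vector product for lists of lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_gradients(A, b, x0):
    """Run linear CG on f(x) = 0.5 x^T A x + b^T x.

    Returns the final iterate and the list of gradients g_0, g_1, ...
    """
    x = list(x0)
    g = [ax + bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x + b
    d = [-gi for gi in g]                             # first search direction
    grads = [g]
    for _ in range(len(b)):
        if dot(g, g) < 1e-24:                         # already converged
            break
        Ad = matvec(A, d)
        alpha = dot(g, g) / dot(d, Ad)                # exact step size
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / dot(g, g)          # Fletcher-Reeves form
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
        grads.append(g)
    return x, grads


def max_conjugacy_violation(A, grads):
    """Largest |g_k^T A g_i| over all pairs with i <= k - 2."""
    worst = 0.0
    for k in range(len(grads)):
        for i in range(k - 1):
            worst = max(worst, abs(dot(grads[k], matvec(A, grads[i]))))
    return worst


# Hypothetical test problem (illustrative values, not from the paper).
A = [[4.0, 1.0, 0.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 5.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 2.0, 1.0],
     [0.0, 0.0, 0.0, 1.0, 6.0]]
b = [1.0, -2.0, 3.0, -1.0, 2.0]

x, grads = cg_gradients(A, b, [0.0] * 5)
violation = max_conjugacy_violation(A, grads)
print("max |g_k^T A g_i| for i <= k-2:", violation)
```

In exact arithmetic the printed quantity would be zero; in floating point it should sit near machine precision. Consecutive gradients (i = k-1) are deliberately excluded, since the stated relation does not cover them.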
10.Hierarchical Space Exploration Campaign Schedule Optimization With Ambiguous Programmatic Requirements
Authors:Nick Gollins, Koki Ho
Abstract: Space exploration plans are becoming increasingly complex as public agencies and private companies target deep-space locations, such as cislunar space and beyond, which require long-duration missions and many supporting systems and payloads. Optimizing multi-mission exploration campaigns is challenging due to the large number of required launches as well as their sequencing and compatibility requirements, making the conventional space logistics formulations not scalable. To tackle this challenge, this paper proposes an alternative approach that leverages a two-level hierarchical optimization algorithm: an evolutionary algorithm is used to explore the campaign scheduling solution space, and each of the solutions is then evaluated using a time-expanded multi-commodity flow mixed-integer linear program. A number of case studies, focusing on the Artemis lunar exploration program, demonstrate how the method can be used to analyze potential campaign architectures. The method enables a potential mission planner to study the sensitivity of a campaign to program-level parameters such as logistics vehicle availability and performance, payload launch windows, and in-situ resource utilization infrastructure efficiency.
11.Krylov Solvers for Interior Point Methods with Applications in Radiation Therapy
Authors:Felix Liu, Albin Fredriksson, Stefano Markidis
Abstract: Interior point methods are widely used for different types of mathematical optimization problems. Many implementations of interior point methods in use today rely on direct linear solvers to solve systems of equations in each iteration. The need to solve ever larger optimization problems more efficiently and the rise of hardware accelerators for general purpose computing has led to a large interest in using iterative linear solvers instead, with the major issue being inevitable ill-conditioning of the linear systems arising as the optimization progresses. We investigate the use of Krylov solvers for interior point methods in solving optimization problems from radiation therapy. We implement a prototype interior point method using a so called doubly augmented formulation of the Karush-Kuhn-Tucker (KKT) linear system of equations, originally proposed by Forsgren and Gill, and evaluate its performance on real optimization problems from radiation therapy. Crucially, our implementation uses a preconditioned conjugate gradient method with Jacobi preconditioning internally. Our measurements of the conditioning of the linear systems indicate that the Jacobi preconditioner improves the conditioning of the systems to a degree that they can be solved iteratively, but there is room for further improvement in that regard. Furthermore, profiling of our prototype code shows that it is suitable for GPU acceleration, which may further improve its performance in practice. Overall, our results indicate that our method can find solutions of acceptable accuracy in reasonable time, even with a simple Jacobi preconditioner.
12.Increasing Supply Chain Resiliency Through Equilibrium Pricing and Stipulating Transportation Quota Regulation
Authors:Mostafa Pazoki, Hamed Samarghandi, Mehdi Behroozi
Abstract: Supply chain disruption can occur for a variety of reasons, including natural disasters or market dynamics. If the disruption is profound and with dire consequences for the economy, the regulators may decide to intervene to minimize the impact for the betterment of the society. This paper investigates the minimum quota regulation on transportation amounts, stipulated by the government in a market where transportation capacity is below total production and profitability levels differ significantly among different products. In North America, an interesting example can happen in rail transportation market, where the rail capacity is used for a variety of products and commodities such as oil and grains. This research assumes that there is a shipping company with limited capacity which will ship a group of products with heterogeneous transportation and production costs and prices. Mathematical problems for the market players as well as the government are presented, solutions are proposed, and implemented in a framed Canadian case study. Subsequently, the conditions that justify government intervention are identified, and an algorithm to obtain the optimum minimum quota is presented.
1.Multiobjective optimization approach to shape and topology optimization of plane trusses with various aspect ratios
Authors:Makoto Ohsaki, Saku Aoyagi, Kazuki Hayashi
Abstract: A multiobjective optimization method is proposed for obtaining the optimal plane trusses simultaneously for various aspect ratios of the initial ground structure as a set of Pareto optimal solutions generated through a single optimization process. The shape and topology are optimized simultaneously to minimize the compliance under constraint on the total structural volume. The strain energy of each member is divided into components of two coordinate directions on the plane. The force density method is used for alleviating difficulties due to existence of coalescent or melting nodes. It is shown in the numerical example that sufficiently accurate optimal solutions are obtained by comparison with those obtained by the linear weighted sum approach that requires solving a single-objective optimization problem many times.
2.Thermo-mechanical level-set topology optimization of a load carrying battery pack for electric aircraft
Authors:Alexandre T. R. Guibert, Murtaza Bookwala, Ashley Cronk, Y. Shirley Meng, H. Alicia Kim
Abstract: A persistent challenge with the development of electric vertical take-off and landing vehicles (eVTOL) to meet flight power and energy demands is the mass of the load and thermal management systems for batteries. One possible strategy to overcome this problem is to employ optimization techniques to obtain a lightweight battery pack while satisfying structural and thermal requirements. In this work, a structural battery pack with high-energy-density cylindrical cells is optimized using the level-set topology optimization method. The heat generated by the batteries is predicted using a high-fidelity electrochemical model for a given eVTOL flight profile. The worst-case scenario for the battery's heat generation is then considered as a source term in the weakly coupled steady-state thermomechanical finite element model used for optimization. The objective of the optimization problem is to minimize the weighted sum of thermal compliance and structural compliance subjected to a volume constraint. The methodology is demonstrated with numerical examples for different sets of weights. The optimized results due to different weights are compared, discussed, and evaluated with thermal and structural performance indicators. The optimized pack topologies are subjected to a transient thermal finite element analysis to assess the battery pack's thermal response.
3.Cooperative Multi-Agent Constrained POMDPs: Strong Duality and Primal-Dual Reinforcement Learning with Approximate Information States
Authors:Nouman Khan, Vijay Subramanian
Abstract: We study the problem of decentralized constrained POMDPs in a team-setting where the multiple non-strategic agents have asymmetric information. Strong duality is established for the setting of infinite-horizon expected total discounted costs when the observations lie in a countable space, the actions are chosen from a finite space, and the immediate cost functions are bounded. Following this, connections with the common-information and approximate information-state approaches are established. The approximate information-states are characterized independent of the Lagrange-multipliers vector so that adaptations of the multiplier (during learning) will not necessitate new representations. Finally, a primal-dual multi-agent reinforcement learning (MARL) framework based on centralized training distributed execution (CTDE) and three time-scale stochastic approximation is developed with the aid of recurrent and feedforward neural-networks as function-approximators.
4.Line Search for Convex Minimization
Authors:Laurent Orseau, Marcus Hutter
Abstract: Golden-section search and bisection search are the two main principled algorithms for 1d minimization of quasiconvex (unimodal) functions. The first one only uses function queries, while the second one also uses gradient queries. Other algorithms exist under much stronger assumptions, such as Newton's method. However, to the best of our knowledge, there is no principled exact line search algorithm for general convex functions -- including piecewise-linear and max-compositions of convex functions -- that takes advantage of convexity. We propose two such algorithms: $\Delta$-Bisection is a variant of bisection search that uses (sub)gradient information and convexity to speed up convergence, while $\Delta$-Secant is a variant of golden-section search and uses only function queries. While bisection search reduces the $x$ interval by a factor 2 at every iteration, $\Delta$-Bisection reduces the (sometimes much) smaller $x^*$-gap $\Delta^x$ (the $x$ coordinates of $\Delta$) by at least a factor 2 at every iteration. Similarly, $\Delta$-Secant also reduces the $x^*$-gap by at least a factor 2 every second function query. Moreover, the $y^*$-gap $\Delta^y$ (the $y$ coordinates of $\Delta$) also provides a refined stopping criterion, which can also be used with other algorithms. Experiments on a few convex functions confirm that our algorithms are always faster than their quasiconvex counterparts, often by more than a factor 2. We further design a quasi-exact line search algorithm based on $\Delta$-Secant. It can be used with gradient descent as a replacement for backtracking line search, for which some parameters can be finicky to tune -- and we provide examples to this effect, on strongly-convex and smooth functions. We provide convergence guarantees, and confirm the efficiency of quasi-exact line search on a few single- and multivariate convex functions.
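For orientation, the two baseline algorithms named at the start of this abstract can be sketched as follows. This is a minimal illustration of classical bisection (using derivative signs) and golden-section search (using only function values); it is not the $\Delta$-Bisection or $\Delta$-Secant method of the paper, and the test function and tolerances are assumptions chosen for the example.

```python
import math


def bisection_min(grad, lo, hi, tol=1e-10):
    """Minimize a convex 1-D function on [lo, hi] from the sign of a (sub)gradient."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0:
            hi = mid          # function is increasing at mid: minimizer is to the left
        else:
            lo = mid          # function is non-increasing at mid: minimizer is to the right
    return 0.5 * (lo + hi)


def golden_section_min(f, lo, hi, tol=1e-7):
    """Minimize a unimodal 1-D function on [lo, hi] using only function queries."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimizer lies in [lo, d]; reuse c as the new right probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimizer lies in [c, hi]; reuse d as the new left probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


# Illustrative convex, non-smooth test function with minimizer x* = 1.
def f(x):
    return (x - 1.5) ** 2 + abs(x)


def subgrad(x):
    sign = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)
    return 2.0 * (x - 1.5) + sign


x_bis = bisection_min(subgrad, -4.0, 6.0)
x_gold = golden_section_min(f, -4.0, 6.0)
print(x_bis, x_gold)
```

Bisection halves the bracket with one gradient query per step, while golden-section search shrinks it by a factor of about 0.618 per function query; neither exploits convexity beyond unimodality, which is the gap the abstract describes.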
5.Differentially Private and Communication-Efficient Distributed Nonconvex Optimization Algorithms
Authors:Antai Xie, Xinlei Yi, Xiaofan Wang, Ming Cao, Xiaoqiang Ren
Abstract: This paper studies the privacy-preserving distributed optimization problem under limited communication, where each agent aims to keep its cost function private while minimizing the sum of all agents' cost functions. To this end, we propose two differentially private distributed algorithms under compressed communication. We show that the proposed algorithms achieve sublinear convergence for smooth (possibly nonconvex) cost functions and linear convergence when the global cost function additionally satisfies the Polyak--Lojasiewicz condition, even for a general class of compressors with bounded relative compression error. Furthermore, we rigorously prove that the proposed algorithms ensure $\epsilon$-differential privacy. Note that the definition of $\epsilon$-differential privacy is stricter than the definition of ($\epsilon$, $\delta$)-differential privacy used in the literature. Simulations are presented to demonstrate the effectiveness of our proposed approach.
6.Learning-based Improvement in State Estimation for Unobservable Systems
Authors:J. G. De la Varga, S. Pineda, J. M. Morales, Á. Porras
Abstract: The task of state estimation faces a major challenge due to the inherent lack of real-time observability, as certain measurements can only be acquired with a delay. As a result, power systems are essentially unobservable in real time, indicating the existence of multiple states that result in identical values for the available measurements. Certain existing approaches utilize historical data to infer the relationship between real-time available measurements and the state. Other learning-based methods aim at generating the pseudo-measurements required to make the system observable. Our paper presents a methodology that utilizes the outcome of an unobservable state estimator to exploit information on the joint probability distribution between real-time available measurements and delayed ones. Through numerical simulations conducted on a realistic electricity network with insufficient real-time measurements, the proposed procedure showcases superior performance compared to existing state forecasting approaches and those relying on inferred pseudo-measurements.
7.Accelerating Optimal Power Flow with GPUs: SIMD Abstraction of Nonlinear Programs and Condensed-Space Interior-Point Methods
Authors:Sungho Shin, François Pacaud, Mihai Anitescu
Abstract: This paper introduces a novel computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs). While GPUs have demonstrated remarkable performance in various computing domains, their application in AC OPF has been limited due to challenges associated with porting sparse automatic differentiation (AD) and sparse linear solver routines to GPUs. We aim to address these issues with two key strategies. First, we utilize a single-instruction, multiple-data (SIMD) abstraction of nonlinear programs (NLP). This approach enables the specification of model equations while preserving their parallelizable structure, and in turn, facilitates the implementation of AD routines that can exploit such structure. Second, we employ a condensed-space interior-point method (IPM) with an inequality relaxation strategy. This technique involves relaxing equality constraints to inequalities and condensing the Karush-Kuhn-Tucker system into a much smaller positive definite system. This strategy offers the key advantage of being able to factorize the KKT matrix without numerical pivoting, which in the past has hampered the parallelization of the IPM algorithm. By combining these two strategies, we can perform the majority of operations on GPUs while keeping the data residing in the device memory only. Comprehensive numerical benchmark results showcase the substantial computational advantage of our approach. Remarkably, for solving large-scale AC OPF problems to a moderate accuracy, our implementations -- MadNLP.jl and ExaModels.jl -- running on NVIDIA GPUs achieve an order of magnitude speedup compared to state-of-the-art tools running on contemporary CPUs.
8.Multi-year Investment Modelling in Energy Systems
Authors:Diego A. Tejada-Arango
Abstract: This paper summarises the main multi-year investment modelling approaches in energy planning models. We go from a simple (basic) formulation to a more complex (general) one to illustrate different levels of detail, and include examples to make the concepts more accessible.
1.Modeling Nonlinear Control Systems via Koopman Control Family: Universal Forms and Subspace Invariance Proximity
Authors:Masih Haseli, Jorge Cortés
Abstract: This paper introduces the Koopman Control Family (KCF), a mathematical framework for modeling general discrete-time nonlinear control systems with the aim of providing a solid theoretical foundation for the use of Koopman-based methods in systems with inputs. We demonstrate that the concept of KCF can completely capture the behavior of nonlinear control systems on a (potentially infinite-dimensional) function space. By employing a generalized notion of subspace invariance under the KCF, we establish a universal form for finite-dimensional models, which encompasses the commonly used linear, bilinear, and linear switched models as specific instances. In cases where the subspace is not invariant under the KCF, we propose a method for approximating models in general form and characterize the model's accuracy using the concept of invariance proximity. The proposed framework naturally lends itself to the incorporation of data-driven methods in modeling and control.
2.On new generalized differentials with respect to a set and their applications
Authors:Xiaolong Qin, Vo Duc Thinh, Jen-Chih Yao
Abstract: The notions and certain fundamental characteristics of the proximal and limiting normal cones with respect to a set are first presented in this paper. We present the ideas of the limiting coderivative and subdifferential with respect to a set of multifunctions and singleton mappings, respectively, based on these normal cones. The necessary and sufficient conditions for the Aubin property with respect to a set of multifunctions are then described by using the limiting coderivative with respect to a set. As a result of the limiting subdifferential with respect to a set, we offer the requisite optimality criteria for local solutions to optimization problems. In addition, we also provide examples to demonstrate the outcomes.
3.Minimal error momentum Bregman-Kaczmarz
Authors:Dirk A. Lorenz, Maximilian Winkler
Abstract: The Bregman-Kaczmarz method is an iterative method which can solve strongly convex problems with linear constraints and uses only one or a selected number of rows of the system matrix in each iteration, thereby making it amenable for large-scale systems. To speed up convergence, we investigate acceleration by heavy ball momentum in the so-called dual update. Heavy ball acceleration of the Kaczmarz method with constant parameters has turned out to be difficult to analyze, in particular no accelerated convergence for the L2-error of the iterates has been proven to the best of our knowledge. Here we propose a way to adaptively choose the momentum parameter by a minimal-error principle similar to a recently proposed method for the standard randomized Kaczmarz method. The momentum parameter can be chosen to exactly minimize the error in the next iterate or to minimize a relaxed version of the minimal error principle. The former choice leads to a theoretically optimal step while the latter is cheaper to compute. We prove improved convergence results compared to the non-accelerated method. Numerical experiments show that the proposed methods can accelerate convergence in practice, also for matrices which arise from applications such as computational tomography.
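As background for the row-action idea described in this abstract, here is a minimal sketch of the plain Kaczmarz iteration, which uses one row of the system matrix per update. It is the classical cyclic variant without momentum and without a Bregman distance, so it is not the method proposed in the paper; the small consistent system and the sweep count are assumptions chosen for illustration.

```python
def kaczmarz(A, b, sweeps=200):
    """Cyclic Kaczmarz for a consistent linear system A x = b.

    Each update projects the current iterate onto the hyperplane
    defined by a single row: a_i . x = b_i.
    """
    n = len(A[0])
    x = [0.0] * n
    for _ in range(sweeps):
        for row, bi in zip(A, b):
            residual = bi - sum(a * xi for a, xi in zip(row, x))
            norm_sq = sum(a * a for a in row)
            step = residual / norm_sq
            x = [xi + step * a for xi, a in zip(x, row)]
    return x


# Hypothetical consistent overdetermined system with solution (1, 2).
A = [[2.0, 1.0],
     [1.0, 3.0],
     [1.0, -1.0]]
b = [4.0, 7.0, -1.0]

x_kacz = kaczmarz(A, b)
print(x_kacz)
```

Because each update touches a single row, the cost per step is independent of the number of equations, which is why methods of this family are attractive for large-scale systems.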
4.Symmetric separable convex resource allocation problems with structured disjoint interval bound constraints
Authors:Martijn H. H. Schoot Uiterkamp
Abstract: Motivated by the problem of scheduling electric vehicle (EV) charging with a minimum charging threshold in smart distribution grids, we introduce the resource allocation problem (RAP) with a symmetric separable convex objective function and disjoint interval bound constraints. In this RAP, the aim is to allocate an amount of resource over a set of $n$ activities, where each individual allocation is restricted to a disjoint collection of $m$ intervals. This is a generalization of classical RAPs studied in the literature where in contrast each allocation is only restricted by simple lower and upper bounds, i.e., $m=1$. We propose an exact algorithm that, for four special cases of the problem, returns an optimal solution in $O \left(\binom{n+m-2}{m-2} (n \log n + nF) \right)$ time, where the term $nF$ represents the number of flops required for one evaluation of the separable objective function. In particular, the algorithm runs in polynomial time when the number of intervals $m$ is fixed. Moreover, we show how this algorithm can be adapted to also output an optimal solution to the problem with integer variables without increasing its time complexity. Computational experiments demonstrate the practical efficiency of the algorithm for small values of $m$ and in particular for solving EV charging problems.
5.Convex Optimization of PV-Battery System Sizing and Operation with Non-Linear Loss Models
Authors:Jolien Despeghel, Jeroen Tant, Johan Driesen
Abstract: In the literature, when optimizing the sizing and operation of a residential PV system in combination with a battery energy storage system, the efficiency of the battery and the converter is generally assumed constant, which corresponds to a linear loss model that can be readily integrated in an optimization model. However, this assumption does not always represent the impact of the losses accurately. For this reason, an approach is presented that includes non-linear converter and battery loss models by applying convex relaxations to the non-linear constraints. The relaxed convex formulation is equivalent to the original non-linear formulation and can be solved more efficiently. The difference between the optimization model with non-linear loss models and linear loss models is illustrated for a residential DC-coupled PV-battery system. The linear loss model is shown to result in an underestimation of the battery size and cost as well as a lower utilization of the battery. The proposed method is useful to accurately model the impact of losses on the optimal sizing and operation in exchange for a slightly higher computational time compared to linear loss models, though far below that of solving the non-relaxed non-linear problem.
6.Nonlinear conjugate gradient method for vector optimization on Riemannian manifolds with retraction and vector transport
Authors:Kangming Chen, Ellen H. Fukuda, Hiroyuki Sato
Abstract: In this paper, we propose nonlinear conjugate gradient methods for vector optimization on Riemannian manifolds. The concepts of Wolfe and Zoutendijk conditions are extended for Riemannian manifolds. Specifically, we establish the existence of intervals of step sizes that satisfy the Wolfe conditions. The convergence analysis covers the vector extensions of the Fletcher--Reeves, conjugate descent, and Dai--Yuan parameters. Under some assumptions, we prove that the sequence obtained by the algorithm can converge to a Pareto stationary point. Moreover, we also discuss several other choices of the parameter. Numerical experiments illustrating the practical behavior of the methods are presented.
7.Be greedy and learn: efficient and certified algorithms for parametrized optimal control problems
Authors:Hendrik Kleikamp, Martin Lazar, Cesare Molinari
Abstract: We consider parametrized linear-quadratic optimal control problems and provide their online-efficient solutions by combining greedy reduced basis methods and machine learning algorithms. To this end, we first extend the greedy control algorithm, which builds a reduced basis for the manifold of optimal final time adjoint states, to the setting where the objective functional consists of a penalty term measuring the deviation from a desired state and a term describing the control energy. Afterwards, we apply machine learning surrogates to accelerate the online evaluation of the reduced model. The error estimates proven for the greedy procedure are further transferred to the machine learning models and thus allow for efficient a posteriori error certification. We discuss the computational costs of all considered methods in detail and show by means of two numerical examples the tremendous potential of the proposed methodology.
8.Inexact proximal methods for weakly convex functions
Authors:Pham Duy Khanh, Boris Mordukhovich, Dat Ba Tran
Abstract: This paper proposes and develops inexact proximal methods for finding stationary points of the sum of a smooth function and a nonsmooth weakly convex one, where an error is present in the calculation of the proximal mapping of the nonsmooth term. A general framework for finding zeros of a continuous mapping is derived from our previous paper on this subject to establish convergence properties of the inexact proximal point method when the smooth term vanishes and of the inexact proximal gradient method when the smooth term satisfies a descent condition. The inexact proximal point method achieves global convergence with constructive convergence rates when the Moreau envelope of the objective function satisfies the Kurdyka-Lojasiewicz (KL) property. Meanwhile, when the smooth term is twice continuously differentiable with a Lipschitz continuous gradient and a differentiable approximation of the objective function satisfies the KL property, the inexact proximal gradient method achieves the global convergence of iterates with constructive convergence rates.
9.$\ell_p$-sphere covering and approximating nuclear $p$-norm
Authors:Jiewen Guan, Simai He, Bo Jiang, Zhening Li
Abstract: The spectral $p$-norm and nuclear $p$-norm of matrices and tensors appear in various applications albeit both are NP-hard to compute. The former sets a foundation of $\ell_p$-sphere constrained polynomial optimization problems and the latter has been found in many rank minimization problems in machine learning. We study approximation algorithms of the tensor nuclear $p$-norm with an aim to establish the approximation bound matching the best one of its dual norm, the tensor spectral $p$-norm. Driven by the application of sphere covering to approximate both tensor spectral and nuclear norms ($p=2$), we propose several types of hitting sets that approximately represent $\ell_p$-sphere with adjustable parameters for different levels of approximations and cardinalities, providing an independent toolbox for decision making on $\ell_p$-spheres. Using the idea in robust optimization and second-order cone programming, we obtain the first polynomial-time algorithm with an $\Omega(1)$-approximation bound for the computation of the matrix nuclear $p$-norm when $p\in(2,\infty)$ is a rational, paving a way for applications in modeling with the matrix nuclear $p$-norm. These two new results enable us to propose various polynomial-time approximation algorithms for the computation of the tensor nuclear $p$-norm using tensor partitions, convex optimization and duality theory, attaining the same approximation bound to the best one of the tensor spectral $p$-norm. We believe the ideas of $\ell_p$-sphere covering with its applications in approximating nuclear $p$-norm would be useful to tackle optimization problems on other sets such as the binary hypercube with its applications in graph theory and neural networks, the nonnegative sphere with its applications in copositive programming and nonnegative matrix factorization.
10.Convergence of Augmented Lagrangian Methods for Composite Optimization Problems
Authors:Nguyen T. V. Hang, Ebrahim Sarabi
Abstract: Local convergence analysis of the augmented Lagrangian method (ALM) is established for a large class of composite optimization problems with nonunique Lagrange multipliers under a second-order sufficient condition. We present a new second-order variational property, called the semi-stability of second subderivatives, and demonstrate that it is widely satisfied for numerous classes of functions, important for applications in constrained and composite optimization problems. Using the latter condition and a certain second-order sufficient condition, we are able to establish Q-linear convergence of the primal-dual sequence for an inexact version of the ALM for composite programs.
1.Adjoint-based optimal control of contractile elastic bodies. Application to limbless locomotion on frictional substrates
Authors:Ashutosh Bijalwan, Jose J Munoz
Abstract: In nature, limbless locomotion is adopted by a wide range of organisms at various length scales. Interestingly, undulatory, crawling and inching/looping gaits constitute a fundamental class of limbless locomotion and are often observed in many species such as caterpillars, earthworms, leeches, larvae, and \emph{C. elegans}, to name a few. In this work, we developed a computationally efficient 3D Finite Element (FE) based unified framework for the locomotion of limbless organisms on soft substrates. Muscle activity is simulated with a multiplicative decomposition of deformation gradient, which allows mimicking a broad range of locomotion patterns in 3D solids on frictional substrates. In particular, a two-field FE formulation based on positions and velocities is proposed. Governing partial differential equations are transformed into equivalent time-continuous differential-algebraic equations (DAEs). Next, the optimal locomotion strategies are studied in the framework of optimal control theory. We resort to adjoint-based methods and deduce the first-order optimality conditions, which yield a system of DAEs with two-point end conditions. The hidden symplectic structure and the Symplectic Euler time integration of the optimality conditions are discussed. The resulting discrete first-order optimality conditions form a non-linear programming problem that is solved efficiently with the Forward-Backward Sweep Method. Finally, some numerical examples are provided to demonstrate the comprehensiveness of the proposed computational framework and investigate the energy-efficient optimal limbless locomotion strategy out of distinct locomotion patterns adopted by limbless organisms.
2.Optimality of Split Covariance Intersection Fusion
Authors:Colin Cros, Pierre-Olivier Amblard, Christophe Prieur, Jean-François Da Rocha
Abstract: Linear fusion is a cornerstone of estimation theory. Optimal linear fusion was derived by Bar-Shalom and Campo in the 1980s. It requires knowledge of the cross-covariances between the errors of the estimators. In distributed or cooperative systems, these cross-covariances are difficult to compute. To avoid an underestimation of the errors when these cross-covariances are unknown, conservative fusions must be performed. A conservative fusion provides a fused estimator with a covariance bound which is guaranteed to be larger than the true (but not computable) covariance of the error. Previous research by Reinhardt et al. proved that, if no additional assumption is made about the errors of the estimators, the minimal bound for fusing two estimators is given by a fusion called Covariance Intersection (CI). In practice, the errors of the estimators often have an uncorrelated component, because the dynamic or measurement noise is assumed to be independent. In this context, CI is no longer the optimal method and an adaptation called Split Covariance Intersection (SCI) has been designed to take advantage of these uncorrelated components. The contribution of this paper is to prove that SCI is the optimal fusion rule for two estimators under the assumption that they have an uncorrelated component. It is proved that SCI provides the optimal covariance bound with respect to any increasing cost function. To prove the result, a minimal volume that should contain all conservative bounds is derived, and the SCI bounds are proved to be the only bounds that tightly circumscribe this minimal volume.
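Illustrative note (not taken from the paper): the abstract builds on plain Covariance Intersection, whose textbook form fuses two estimates through a convex combination of their information matrices. The sketch below shows only that baseline CI rule, not the split variant (SCI) studied in the paper. The function name `covariance_intersection`, the grid search over the weight, and the trace cost are assumptions made for illustration.

```python
import numpy as np


def covariance_intersection(x1, P1, x2, P2, num_weights=101):
    """Fuse two estimates whose cross-covariance is unknown (plain CI).

    The fused information matrix is omega * P1^-1 + (1 - omega) * P2^-1.
    The weight omega is picked by a simple grid search that minimises the
    trace of the fused covariance bound (an illustrative choice of cost).
    """
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    best = None
    for omega in np.linspace(0.0, 1.0, num_weights):
        info = omega * I1 + (1.0 - omega) * I2
        P = np.linalg.inv(info)
        cost = np.trace(P)
        if best is None or cost < best[0]:
            # Fused mean: information-weighted combination of both estimates.
            x = P @ (omega * (I1 @ x1) + (1.0 - omega) * (I2 @ x2))
            best = (cost, x, P, omega)
    _, x, P, omega = best
    return x, P, omega


if __name__ == "__main__":
    x, P, omega = covariance_intersection(
        np.array([1.0, 0.0]), np.diag([1.0, 4.0]),
        np.array([0.0, 1.0]), np.diag([4.0, 1.0]),
    )
    print("omega:", omega)
    print("fused mean:", x)
    print("fused covariance bound:\n", P)
```

A grid search keeps the sketch dependency-free; a scalar optimiser over the weight would serve the same purpose.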
3.A Variance-Reduced Aggregation Based Gradient Tracking method for Distributed Optimization over Directed Networks
Authors:Shengchao Zhao, Siyuan Song, Yongchao Liu
Abstract: This paper studies the distributed optimization problem over directed networks with noisy information-sharing. To resolve the imperfect communication issue over directed networks, a series of noise-robust variants of the Push-Pull/AB method have been developed. These methods improve the robustness of the Push-Pull method against the information-sharing noise by adding small factors to the weight matrices and replacing the global gradient tracking with the cumulative gradient tracking. Based on the two techniques, we propose a new variant of the Push-Pull method by presenting a novel mechanism of inter-agent information aggregation, named variance-reduced aggregation (VRA). VRA allows us to relax some conditions on the objective function and networks. When the objective function is convex and the information-sharing noise is variance-unbounded, it can be shown that the proposed method converges to the optimal solution almost surely. When the objective function is strongly convex and the information-sharing noise is variance-bounded, the proposed method achieves the convergence rate of $\mathcal{O}\left(k^{-(1-\epsilon)}\right)$ in the mean square sense, where $\epsilon$ can be taken arbitrarily close to 0. Simulated experiments on ridge regression problems verify the effectiveness of the proposed method.
4.On the robustness of networks of heterogeneous semi-passive systems interconnected over directed graphs
Authors:Anes Lazri, Elena Panteley, Antonio Loria
Abstract: In this short note we provide a proof of boundedness of solutions for a network system composed of heterogeneous nonlinear autonomous systems interconnected over a directed graph. The sole assumptions imposed are that the systems are semi-passive [1] and the graph contains a spanning tree.
5.Feedback and Open-Loop Nash Equilibria for LQ Infinite-Horizon Discrete-Time Dynamic Games
Authors:A. Monti, B. Nortmann, T. Mylvaganam, M. Sassano
Abstract: We consider dynamic games defined over an infinite horizon, characterized by linear, discrete-time dynamics and quadratic cost functionals. Considering such linear-quadratic (LQ) dynamic games, we focus on their solutions in terms of Nash equilibrium strategies. Both Feedback (F-NE) and Open-Loop (OL-NE) Nash equilibrium solutions are considered. The contributions of the paper are threefold. First, our detailed study reveals some interesting structural insights in relation to F-NE solutions. Second, as a stepping stone towards our consideration of OL-NE strategies, we consider a specific infinite-horizon discrete-time (single-player) optimal control problem, wherein the dynamics are influenced by a known exogenous input and draw connections between its solution obtained via Dynamic Programming and Pontryagin's Minimum Principle. Finally, we exploit the latter result to provide a characterization of OL-NE strategies of the class of infinite-horizon dynamic games. The results and key observations made throughout the paper are illustrated via a numerical example.
6.A Stochastic Gradient Tracking Algorithm for Decentralized Optimization With Inexact Communication
Authors:Suhail M. Shah, Raghu Bollapragada
Abstract: Decentralized optimization is typically studied under the assumption of noise-free transmission. However, real-world scenarios often involve the presence of noise due to factors such as additive white Gaussian noise channels or probabilistic quantization of transmitted data. These sources of noise have the potential to degrade the performance of decentralized optimization algorithms if not effectively addressed. In this paper, we focus on the noisy communication setting and propose an algorithm that bridges the performance gap caused by communication noise while also mitigating other challenges like data heterogeneity. We establish theoretical results of the proposed algorithm that quantify the effect of communication noise and gradient noise on the performance of the algorithm. Notably, our algorithm achieves the optimal convergence rate for minimizing strongly convex, smooth functions in the context of inexact communication and stochastic gradients. Finally, we illustrate the superior performance of the proposed algorithm compared to its state-of-the-art counterparts on machine learning problems using MNIST and CIFAR-10 datasets.
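Illustrative note (not taken from the paper): the abstract concerns gradient tracking under noisy communication. As background, the sketch below shows only the standard noise-free gradient-tracking recursion on a toy problem where agent i holds $f_i(x) = \frac{1}{2}(x - b_i)^2$, so the minimiser of the sum is the mean of the $b_i$. The function name `gradient_tracking`, the mixing matrix, and the step size are assumptions chosen for illustration; the paper's algorithm, noise model, and parameters are not reproduced here.

```python
import numpy as np


def gradient_tracking(b, W, step=0.1, iters=500):
    """Noise-free gradient tracking for local costs f_i(x) = 0.5 * (x - b_i)^2.

    b : vector of local targets, one scalar per agent.
    W : doubly stochastic mixing matrix of the communication graph.
    Each agent keeps an iterate x_i and a tracker y_i of the average gradient.
    """
    b = np.asarray(b, dtype=float)

    def grad(x):
        # Stacked local gradients: entry i is the gradient of f_i at x_i.
        return x - b

    x = np.zeros_like(b)
    y = grad(x)  # trackers are initialised with the local gradients
    for _ in range(iters):
        x_new = W @ x - step * y               # mix, then move along the tracker
        y = W @ y + grad(x_new) - grad(x)      # mix, then add the gradient change
        x = x_new
    return x


if __name__ == "__main__":
    W = np.array([[0.50, 0.25, 0.25],
                  [0.25, 0.50, 0.25],
                  [0.25, 0.25, 0.50]])
    print(gradient_tracking([1.0, 2.0, 6.0], W))  # every agent approaches the mean of b
```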
1.Efficient Algorithm for QCQP problem with Multiple Quadratic Constraints
Authors:Huang Yin
Abstract: Starting from a classic financial optimization problem, we first propose a cutting plane algorithm for this problem. Then we use spectral decomposition to transform the problem into an equivalent D.C. programming problem, and the corresponding upper bound estimate is given by the SCO algorithm; then the corresponding lower bound convex relaxation is given by the McCormick envelope. Based on this, we propose a global algorithm for this problem and establish the convergence of the algorithms. Moreover, the algorithm is still valid for QCQP with multiple quadratic constraints and quadratic matrix in general form.
2.Stabilization of uncertain linear dynamics: an offline-online strategy
Authors:Philipp A. Guth, Karl Kunisch, Sérgio S. Rodrigues
Abstract: A strategy is proposed for adaptive stabilization of linear systems, depending on an uncertain parameter. Offline, the Riccati stabilizing feedback input control operators, corresponding to parameters in a finite training set of chosen candidates for the uncertain parameter, are solved and stored in a library. A uniform partition of the infinite time interval is chosen. In each of these subintervals, the input is given by one of the stored parameter dependent Riccati feedback operators. This parameter is updated online, at the end of each subinterval, based on input and output data, where the true data, corresponding to the true parameter, is compared to fictitious data that one would obtain in case the parameter was in a selected subset of the training set. The auxiliary data can be computed in parallel, so that the parameter update can be performed in real time. The focus is put on the case that the unknown parameter is constant and that the free dynamics is time-periodic. The stabilizing performance of the input obtained by the proposed strategy is illustrated by numerical simulations, for both constant and switching parameters.
3.Gradient-Type Method for Optimization Problems with Polyak-Lojasiewicz Condition: Relative Inexactness in Gradient and Adaptive Parameters Setting
Authors:Sergei M. Puchinin, Fedor S. Stonyakin
Abstract: We consider minimization problems with the well-known Polyak-Lojasiewicz condition and Lipschitz-continuous gradient. Such problems occur in different places in machine learning and related fields. Furthermore, we assume that a gradient is available with some relative inexactness. We propose an adaptive gradient-type algorithm, where the adaptivity takes place with respect to the smoothness parameter and the level of the gradient inexactness. The theoretical estimate of the quality of the output point is obtained and backed up by experimental results.
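Illustrative note (not taken from the paper): relative inexactness is read here as an available gradient $\tilde g$ with $\|\tilde g - \nabla f(x)\| \le \alpha \|\nabla f(x)\|$ for some $\alpha \in [0, 1)$. The sketch below is a non-adaptive gradient method under that assumption, with $L$ and $\alpha$ taken as known. The step size $h = (1-\alpha)/((1+\alpha)^2 L)$ follows from the standard descent lemma and is an assumption of this sketch, not necessarily the paper's choice; the adaptive variant described in the abstract is not reproduced. The name `inexact_gradient_descent` and the random perturbation model are hypothetical.

```python
import numpy as np


def inexact_gradient_descent(grad, x0, L, alpha, iters=500, seed=0):
    """Gradient descent driven by a relatively inexact gradient.

    At each step the exact gradient g is perturbed by a random vector d with
    ||d|| = alpha * ||g||, which simulates relative inexactness of level alpha.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    # Step size that guarantees descent for any perturbation of relative size alpha.
    step = (1.0 - alpha) / ((1.0 + alpha) ** 2 * L)
    for _ in range(iters):
        g = grad(x)
        d = rng.standard_normal(g.shape)
        norm_d = np.linalg.norm(d)
        if norm_d > 0.0:
            d *= alpha * np.linalg.norm(g) / norm_d
        x = x - step * (g + d)
    return x


if __name__ == "__main__":
    A = np.diag([1.0, 10.0])  # f(x) = 0.5 x^T A x satisfies the PL condition with mu = 1
    x = inexact_gradient_descent(lambda v: A @ v, [1.0, 1.0], L=10.0, alpha=0.3)
    print("final objective:", 0.5 * x @ A @ x)
```

With this step size the descent lemma gives $f(x^+) \le f(x) - \frac{(1-\alpha)^2}{2(1+\alpha)^2 L}\|\nabla f(x)\|^2$, and the PL inequality $\|\nabla f(x)\|^2 \ge 2\mu\,(f(x) - f^*)$ then yields a linear rate.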
4.Improving Conflict Analysis in MIP Solvers by Pseudo-Boolean Reasoning
Authors:Gioni Mexi, Timo Berthold, Ambros Gleixner, Jakob Nordström
Abstract: Conflict analysis has been successfully generalized from Boolean satisfiability (SAT) solving to mixed integer programming (MIP) solvers, but although MIP solvers operate with general linear inequalities, the conflict analysis in MIP has been limited to reasoning with the more restricted class of clausal constraints. This is in contrast to how conflict analysis is performed in so-called pseudo-Boolean solving, where solvers can reason directly with 0-1 integer linear inequalities rather than with clausal constraints extracted from such inequalities. In this work, we investigate how pseudo-Boolean conflict analysis can be integrated in MIP solving, focusing on 0-1 integer linear programs (0-1 ILPs). Phrased in MIP terminology, conflict analysis can be understood as a sequence of linear combinations and cuts. We leverage this perspective to design a new conflict analysis algorithm based on mixed integer rounding (MIR) cuts, which theoretically dominates the state-of-the-art division-based method in pseudo-Boolean solving. We also report results from a first proof-of-concept implementation of different pseudo-Boolean conflict analysis methods in the open-source MIP solver SCIP. When evaluated on a large and diverse set of 0-1 ILP instances from MIPLIB 2017, our new MIR-based conflict analysis outperforms both previous pseudo-Boolean methods and the clause-based method used in MIP. Our conclusion is that pseudo-Boolean conflict analysis in MIP is a promising research direction that merits further study, and that it might also make sense to investigate the use of such conflict analysis to generate stronger no-goods in constraint programming.
5.Convex semi-infinite programming algorithms with inexact separation oracles
Authors:Antoine Oustry, Martina Cerulli
Abstract: Solving convex Semi-Infinite Programming (SIP) problems is challenging when the separation problem, i.e., the problem of finding the most violated constraint, is computationally hard. We propose to tackle this difficulty by solving the separation problem approximately, i.e., by using an inexact oracle. Our focus lies in two algorithms for SIP, namely the Cutting-Planes (CP) and the Inner-Outer Approximation (IOA) algorithms. We prove the CP convergence rate to be in O(1/k), where k is the number of calls to the limited-accuracy oracle, if the objective function is strongly convex. Compared to the CP algorithm, the advantage of the IOA algorithm is the feasibility of its iterates. In the case of a semi-infinite program with Quadratically Constrained Quadratic Programming separation problem, we prove the convergence of the IOA algorithm toward an optimal solution of the SIP problem despite the oracle's inexactness.
6.Optimisation and monotonicity of the second Robin eigenvalue on a planar exterior domain
Authors:David Krejcirik, Vladimir Lotoreichik
Abstract: We consider the Laplace operator in the exterior of a compact set in the plane, subject to Robin boundary conditions. If the boundary coupling is sufficiently negative, there are at least two discrete eigenvalues below the essential spectrum. We state a general conjecture that the second eigenvalue is maximised by the exterior of a disk under isochoric or isoperimetric constraints. We prove an isoelastic version of the conjecture for the exterior of convex domains. Finally, we establish a monotonicity result for the second eigenvalue under the condition that the compact set is strictly star-shaped and centrally symmetric.
7.Robust Regret Optimal Control
Authors:Jietian Liu, Peter Seiler
Abstract: This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete-time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal control. It is shown that a controller achieves robust regret if and only if it satisfies a robust H-infinity performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated via two examples.
8.Parameter-Free FISTA by Adaptive Restart and Backtracking
Authors:Jean-François Aujol, Luca Calatroni, Charles Dossal, Hippolyte Labarrière, Aude Rondepierre
Abstract: We consider a combined restarting and adaptive backtracking strategy for the popular Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) frequently employed for accelerating the convergence speed of large-scale structured convex optimization problems. Several variants of FISTA enjoy a provable linear convergence rate for the function values $F(x_n)$ of the form $\mathcal{O}( e^{-K\sqrt{\mu/L}~n})$ under the prior knowledge of problem conditioning, i.e. of the ratio between the (\L ojasiewicz) parameter $\mu$ determining the growth of the objective function and the Lipschitz constant $L$ of its smooth component. These parameters are nonetheless hard to estimate in many practical cases. Recent works address the problem by estimating either parameter via suitable adaptive strategies. In our work both parameters can be estimated at the same time by means of an algorithmic restarting scheme where, at each restart, a non-monotone estimation of $L$ is performed. For this scheme, theoretical convergence results are proved, showing that a $\mathcal{O}( e^{-K\sqrt{\mu/L}n})$ convergence speed can still be achieved along with quantitative estimates of the conditioning. The resulting Free-FISTA algorithm is therefore parameter-free. Several numerical results are reported to confirm the practical interest of its use in many exemplar problems.
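Illustrative note (not taken from the paper): the two ingredients named in the abstract, backtracking on $L$ and restarting, can be sketched on the lasso problem $\min_x \frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$. The code below is a generic FISTA with a monotone (increase-only) backtracking estimate of $L$ and a simple function-value restart. It is not the Free-FISTA scheme, which uses a non-monotone estimate of $L$ and its own restart rule. The names `fista_restart`, `lasso_objective`, and `soft_threshold` and all default parameters are assumptions for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_objective(A, b, lam, x):
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))


def fista_restart(A, b, lam, iters=200, L0=1.0, eta=2.0):
    """FISTA for the lasso with backtracking on L and function-value restart."""
    def smooth(z):
        return 0.5 * np.sum((A @ z - b) ** 2)

    x = np.zeros(A.shape[1])
    y = x.copy()
    t, L = 1.0, L0
    for _ in range(iters):
        g = A.T @ (A @ y - b)
        fy = smooth(y)
        # Backtracking: increase L until the quadratic upper bound holds at x_new
        # (the small constant absorbs floating-point round-off).
        while True:
            x_new = soft_threshold(y - g / L, lam / L)
            d = x_new - y
            if smooth(x_new) <= fy + g @ d + 0.5 * L * (d @ d) + 1e-12:
                break
            L *= eta
        if lasso_objective(A, b, lam, x_new) > lasso_objective(A, b, lam, x):
            # Restart: discard the candidate and the momentum, continue from x.
            y, t = x.copy(), 1.0
            continue
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # extrapolation step
        x, t = x_new, t_new
    return x


if __name__ == "__main__":
    # With A equal to the identity, the lasso solution is soft_threshold(b, lam).
    print(fista_restart(np.eye(3), np.array([3.0, -0.5, 1.0]), lam=1.0))
```

After a restart the next step is a plain proximal-gradient step from the last accepted iterate, so the objective values of accepted iterates never increase.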
1.Federated K-Means Clustering via Dual Decomposition-based Distributed Optimization
Authors:Vassilios Yfantis, Achim Wagner, Martin Ruskowski
Abstract: The use of distributed optimization in machine learning can be motivated either by the resulting preservation of privacy or the increase in computational efficiency. On the one hand, training data might be stored across multiple devices. Training a global model within a network where each node only has access to its confidential data requires the use of distributed algorithms. Even if the data is not confidential, sharing it might be prohibitive due to bandwidth limitations. On the other hand, the ever-increasing amount of available data leads to large-scale machine learning problems. By splitting the training process across multiple nodes its efficiency can be significantly increased. This paper aims to demonstrate how dual decomposition can be applied for distributed training of $ K $-means clustering problems. After an overview of distributed and federated machine learning, the mixed-integer quadratically constrained programming-based formulation of the $ K $-means clustering training problem is presented. The training can be performed in a distributed manner by splitting the data across different nodes and linking these nodes through consensus constraints. Finally, the performance of the subgradient method, the bundle trust method, and the quasi-Newton dual ascent algorithm are evaluated on a set of benchmark problems. While the mixed-integer programming-based formulation of the clustering problems suffers from weak integer relaxations, the presented approach can potentially be used to enable an efficient solution in the future, both in a central and distributed setting.
2.Finding the spectral radius of a nonnegative irreducible symmetric tensor via DC programming
Authors:Xueli Bai, Dong-Hui Li, Lei Wu, Jiefeng Xu
Abstract: The Perron-Frobenius theorem says that the spectral radius of an irreducible nonnegative tensor is the unique positive eigenvalue corresponding to a positive eigenvector. With this in mind, the purpose of this paper is to find the spectral radius and its corresponding positive eigenvector of an irreducible nonnegative symmetric tensor. By transferring the eigenvalue problem into an equivalent problem of minimizing a concave function on a closed convex set, which is typically a DC (difference of convex functions) programming problem, we derive a simpler and cheaper iterative method. The proposed method is well-defined. Furthermore, we show that both sequences of the eigenvalue estimates and the eigenvector evaluations generated by the method $Q$-linearly converge to the spectral radius and its corresponding eigenvector, respectively. To accelerate the method, we introduce a line search technique. The improved method retains the same convergence property as the original version. Preliminary numerical results show that the improved method performs quite well.
3.DecisionProgramming.jl -- A framework for modelling decision problems using mathematical programming
Authors:Juho Andelmin, Jaan Tollander de Balsch, Helmi Hankimaa, Olli Herrala, Fabricio Oliveira
Abstract: We present DecisionProgramming.jl, a new Julia package for modelling decision problems as mixed-integer programming (MIP) equivalents. The package allows the user to pose decision problems as influence diagrams which are then automatically converted to an equivalent MIP formulation. This MIP formulation is implemented using JuMP.jl, a Julia package providing an algebraic syntax for formulating mathematical programming problems. In this paper, we show novel MIP formulations used in the package, which considerably improve the computational performance of the MIP solver. We also present a novel heuristic that can be employed to warm start the solution, as well as providing heuristic solutions to more computationally challenging problems. Lastly, we describe a novel case study showcasing decision programming as an alternative framework for modelling multi-stage stochastic dynamic programming problems.
4.Computational Guarantees for Doubly Entropic Wasserstein Barycenters via Damped Sinkhorn Iterations
Authors:Lénaïc Chizat, Tomas Vaškevičius
Abstract: We study the computation of doubly regularized Wasserstein barycenters, a recently introduced family of entropic barycenters governed by inner and outer regularization strengths. Previous research has demonstrated that various regularization parameter choices unify several notions of entropy-penalized barycenters while also revealing new ones, including a special case of debiased barycenters. In this paper, we propose and analyze an algorithm for computing doubly regularized Wasserstein barycenters. Our procedure builds on damped Sinkhorn iterations followed by exact maximization/minimization steps and guarantees convergence for any choice of regularization parameters. An inexact variant of our algorithm, implementable using approximate Monte Carlo sampling, offers the first non-asymptotic convergence guarantees for approximating Wasserstein barycenters between discrete point clouds in the free-support/grid-free setting.
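Illustrative note (not taken from the paper): the procedure in the abstract builds on damped Sinkhorn iterations for barycenters. As background only, the sketch below shows the classical undamped Sinkhorn iteration for entropic optimal transport between two discrete distributions. The barycenter problem, the double regularization, and the damping are not reproduced. The function name `sinkhorn`, the regularization strength, and the toy cost matrix are assumptions for illustration.

```python
import numpy as np


def sinkhorn(a, b, C, reg=0.5, iters=500):
    """Entropic optimal transport plan between histograms a and b.

    C   : cost matrix of shape (len(a), len(b)).
    reg : entropic regularization strength.
    The plan has the form diag(u) K diag(v) with K = exp(-C / reg); u and v
    are rescaled alternately so that the plan matches the marginals a and b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    K = np.exp(-np.asarray(C, dtype=float) / reg)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)  # match the column marginal
        u = a / (K @ v)    # match the row marginal
    return u[:, None] * K * v[None, :]


if __name__ == "__main__":
    plan = sinkhorn([0.5, 0.5], [0.25, 0.75], [[0.0, 1.0], [1.0, 0.0]])
    print("row sums:", plan.sum(axis=1))
    print("column sums:", plan.sum(axis=0))
```

For small regularization strengths the kernel entries can underflow; log-domain updates are the usual remedy and are omitted here to keep the sketch short.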
5.A new Lagrangian approach to control affine systems with a quadratic Lagrange term
Authors:Sigrid Leyendecker, Sofya Maslovskaya, Sina Ober-Blobaum, Rodrigo T. Sato Martin de Almagro, Flora Orsolya Szemenyei
Abstract: In this work, we consider optimal control problems for mechanical systems on vector spaces with fixed initial and free final state and a quadratic Lagrange term. Specifically, the dynamics is described by a second order ODE containing an affine control term and we allow linear coordinate changes in the configuration space. Classically, Pontryagin's maximum principle gives necessary optimality conditions for the optimal control problem. For smooth problems, alternatively, a variational approach based on an augmented objective can be followed. Here, we propose a new Lagrangian approach leading to equivalent necessary optimality conditions in the form of Euler-Lagrange equations. Thus, the differential geometric structure (similar to classical Lagrangian dynamics) can be exploited in the framework of optimal control problems. In particular, the formulation enables the symplectic discretisation of the optimal control problem via variational integrators in a straightforward way.
6.Multiple Lyapunov Functions and Memory: A Symbolic Dynamics Approach to Systems and Control
Authors:Matteo Della Rossa, Raphaël M. Jungers
Abstract: We propose a novel framework for the Lyapunov analysis of a large class of hybrid systems, inspired by the theory of symbolic dynamics and earlier results on the restricted class of switched systems. This new framework allows us to leverage language theory tools in order to provide a universal characterization of Lyapunov stability for this class of systems. We establish, in particular, a formal connection between multiple Lyapunov functions and techniques based on memorization and/or prediction of the discrete part of the state. This allows us to provide an equivalent (single) Lyapunov function, for any given multiple-Lyapunov criterion. By leveraging our language-theoretic formalism, a new class of stability conditions is then obtained when considering both memory and future values of the state in a joint fashion, providing new numerical schemes that outperform existing techniques. Our techniques are then illustrated on numerical examples.
7.Assortment Optimization with Visibility Constraints
Authors:Theo Barre, Omar El Housni, Andrea Lodi
Abstract: Motivated by applications in e-retail and online advertising, we study the problem of assortment optimization under visibility constraints, which we refer to as APV. We are given a universe of substitutable products and a stream of T customers. The objective is to determine the optimal assortment of products to offer to each customer in order to maximize the total expected revenue, subject to the constraint that each product is required to be shown to a minimum number of customers. The minimum display requirement for each product is given exogenously and we refer to these constraints as visibility constraints. We assume that customer choices follow a Multinomial Logit model (MNL). We provide a characterization of the structure of the optimal assortments and present an efficient polynomial time algorithm for solving APV. To accomplish this, we introduce a novel function called the "expanded revenue" of an assortment and establish its supermodularity. Our algorithm takes advantage of this structural property. Additionally, we demonstrate that APV can be formulated as a compact linear program. We also examine the revenue loss resulting from the enforcement of visibility constraints, comparing it to the unconstrained version of the problem. To offset this loss, we propose a novel strategy to distribute the loss among the products subject to visibility constraints. Each vendor is charged an amount proportional to their product's contribution to the revenue loss. Finally, we present the results of our numerical experiments providing illustration of the obtained outcomes, and we discuss some preliminary results on the extension of the problem to accommodate cardinality constraints.
8.Reduced Control Systems on Symmetric Lie Algebras
Authors:Emanuel Malvetti, Gunther Dirr, Frederik vom Ende, Thomas Schulte-Herbrüggen
Abstract: For a symmetric Lie algebra $\mathfrak g=\mathfrak k\oplus\mathfrak p$ we consider a class of bilinear or more general control-affine systems on $\mathfrak p$ defined by a drift vector field $X$ and control vector fields $\mathrm{ad}_{k_i}$ for $k_i\in\mathfrak k$ such that one has fast and full control on the corresponding compact group $\mathbf K$. We show that under quite general assumptions on $X$ such a control system is essentially equivalent to a natural reduced system on a maximal Abelian subspace $\mathfrak a\subseteq\mathfrak p$, and likewise to related differential inclusions defined on $\mathfrak a$. We derive a number of general results for such systems and as an application we prove a simulation result with respect to the preorder induced by the Weyl group action.
9.On structural contraction of biological interaction networks
Authors:M. Ali Al-Radhawi, David Angeli, Eduardo Sontag
Abstract: In previous work, we have developed an approach for characterizing the long-term dynamics of classes of Biological Interaction Networks (BINs), based on "rate-dependent Lyapunov functions". In this work, we show that stronger notions of convergence can be established by proving structural contractivity with respect to non-standard norms. We illustrate our theory with examples from signaling pathways.
1.Decentralized Optimization Over Slowly Time-Varying Graphs: Algorithms and Lower Bounds
Authors:Dmitry Metelev, Aleksandr Beznosikov, Alexander Rogozin, Alexander Gasnikov, Anton Proskurnikov
Abstract: We consider a decentralized convex unconstrained optimization problem, where the cost function can be decomposed into a sum of strongly convex and smooth functions, associated with individual agents, interacting over a static or time-varying network. Our main concern is the convergence rate of first-order optimization algorithms as a function of the network's graph, more specifically, of the condition numbers of gossip matrices. We are interested in the case when the network is time-varying but the rate of changes is restricted. We study two cases: a randomly changing network satisfying the Markov property and a network changing in a deterministic manner. For the random case, we propose a decentralized optimization algorithm with accelerated consensus. For the deterministic scenario, we show that if the graph is changing in a worst-case way, accelerated consensus is not possible even if only two edges are changed at each iteration. The fact that such a low rate of network changes is sufficient to make accelerated consensus impossible is novel and improves the previous results in the literature.
2.Finite-sum optimization: Adaptivity to smoothness and loopless variance reduction
Authors:Bastien Batardière, Julien Chiquet, Joon Kwon
Abstract: For finite-sum optimization, variance-reduced gradient methods (VR) compute at each iteration the gradient of a single function (or of a mini-batch), and yet achieve faster convergence than SGD thanks to a carefully crafted lower-variance stochastic gradient estimator that reuses past gradients. Another important line of research of the past decade in continuous optimization is adaptive algorithms such as AdaGrad, which dynamically adjust the (possibly coordinate-wise) learning rate to past gradients and thereby adapt to the geometry of the objective function. Variants such as RMSprop and Adam demonstrate outstanding practical performance that has contributed to the success of deep learning. In this work, we present AdaVR, which combines the AdaGrad algorithm with variance-reduced gradient estimators such as SAGA or L-SVRG. We show that AdaVR inherits both good convergence properties from VR methods and the adaptive nature of AdaGrad: in the case of $L$-smooth convex functions we establish a gradient complexity of $O(n+(L+\sqrt{nL})/\varepsilon)$ without prior knowledge of $L$. Numerical experiments demonstrate the superiority of AdaVR over state-of-the-art methods. Moreover, we empirically show that the RMSprop and Adam algorithms combined with variance-reduced gradient estimators achieve even faster convergence.
3.Simultaneous Optimization of Launch Vehicle Stage and Trajectory Considering Operational Safety Constraints
Authors:Jaeyoul Ko, Jaewoo Kim, Jimin Choi, Jaemyung Ahn
Abstract: A conceptual design of a launch vehicle involves the optimization of trajectory and stages considering its launch operations. This process encompasses various disciplines, such as structural design, aerodynamics, propulsion systems, flight control, and stage sizing. Traditional approaches used for the conceptual design of a launch vehicle conduct the stage and trajectory designs sequentially, often leading to high computational complexity and suboptimal results. This paper presents an optimization framework that addresses both trajectory optimization and staging in an integrated way. The proposed framework aims to maximize the payload-to-liftoff mass ratio while satisfying the constraints required for safe launch operations (e.g., the impact points of burnt stages and fairing). A case study demonstrates the advantage of the proposed framework compared to the traditional sequential optimization approach.
4.Dissipative State and Output Estimation of Systems with General Delays
Authors:Qian Feng, Feng Xiao, Xiaoyu Wang
Abstract: Dissipative state and output estimation for continuous time-delay systems poses a significant challenge when an unlimited number of pointwise and general distributed delays (DDs) are concerned. We propose an effective solution to this open problem using the Krasovskiĭ functional (KF) framework in conjunction with a quadratic supply rate function, where both the plant and the estimator can accommodate an unlimited number of pointwise and general DDs. All DDs can contain an unlimited number of square-integrable kernel functions, which are treated by an equivalent decomposition-approximation scheme. This novel approach allows for the factorization or approximation of any kernel function without introducing conservatism, and facilitates the construction of a complete-type KF with integral kernels that can encompass any number of differentiable (weak derivatives) and linearly independent functions. Our proposed solution is expressed as convex semidefinite programs presented in two theorems along with an iterative algorithm, which eliminates the need for nonlinear solvers. We demonstrate the effectiveness of our method using two challenging numerical experiments, including a system stabilized by a non-smooth controller.
5.Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition
Authors:Aleksandr Lobanov, Alexander Gasnikov
Abstract: This paper is devoted to solving a convex stochastic optimization problem in an overparameterization setup for the case where the original gradient computation is not available, but an objective function value can be computed. For this class of problems we provide a novel gradient-free algorithm, whose creation approach is based on applying a gradient approximation with $l_2$ randomization instead of a gradient oracle in the biased Accelerated SGD algorithm, which generalizes the convergence results of the AC-SA algorithm to the case where the gradient oracle returns a noisy (inexact) objective function value. We also perform a detailed analysis to find the maximum admissible level of adversarial noise at which we can guarantee to achieve the desired accuracy. We verify the theoretical results of convergence using a model example.
6.Open Problem: Polynomial linearly-convergent method for geodesically convex optimization?
Authors:Christopher Criscitiello, David Martínez-Rubio, Nicolas Boumal
Abstract: Let $f \colon \mathcal{M} \to \mathbb{R}$ be a Lipschitz and geodesically convex function defined on a $d$-dimensional Riemannian manifold $\mathcal{M}$. Does there exist a first-order deterministic algorithm which (a) uses at most $O(\mathrm{poly}(d) \log(\epsilon^{-1}))$ subgradient queries to find a point with target accuracy $\epsilon$, and (b) requires only $O(\mathrm{poly}(d))$ arithmetic operations per query? In convex optimization, the classical ellipsoid method achieves this. After detailing related work, we provide an ellipsoid-like algorithm with query complexity $O(d^2 \log^2(\epsilon^{-1}))$ and per-query complexity $O(d^2)$ for the limited case where $\mathcal{M}$ has constant curvature (hemisphere or hyperbolic space). We then detail possible approaches and corresponding obstacles for designing an ellipsoid-like method for general Riemannian manifolds.
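For readers unfamiliar with the classical Euclidean ellipsoid method that the abstract above takes as its benchmark, here is a minimal Python sketch of the central-cut variant for unconstrained convex minimization in $\mathbb{R}^n$ with $n \ge 2$. It is the textbook Euclidean method, not the Riemannian algorithm of the paper; the helper names and the $\ell_1$ test objective are assumptions made for illustration.

```python
import math


def ellipsoid_minimize(f, subgrad, center, radius, iters):
    """Central-cut ellipsoid method (requires dimension n >= 2).

    The current ellipsoid is {z : (z - x)^T P^{-1} (z - x) <= 1}; it is
    assumed that the initial ball contains a minimizer of f.
    """
    n = len(center)
    x = list(center)
    P = [[radius ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    best_x, best_val = list(x), f(x)
    for _ in range(iters):
        g = subgrad(x)
        Pg = [sum(P[i][j] * g[j] for j in range(n)) for i in range(n)]
        gPg = sum(g[i] * Pg[i] for i in range(n))
        if gPg <= 0.0:
            # Zero subgradient (x is optimal) or numerically collapsed ellipsoid.
            break
        root = math.sqrt(gPg)
        # Move the center into the half-space {z : g^T (z - x) <= 0}.
        x = [x[i] - Pg[i] / ((n + 1) * root) for i in range(n)]
        # Shrink the ellipsoid around the retained half.
        coef = n * n / (n * n - 1.0)
        P = [[coef * (P[i][j] - 2.0 * Pg[i] * Pg[j] / ((n + 1) * gPg))
              for j in range(n)] for i in range(n)]
        val = f(x)
        if val < best_val:
            best_x, best_val = list(x), val
    return best_x, best_val


def l1_objective(x):
    """Toy nonsmooth convex objective (an assumption), minimized at (1, -2)."""
    return abs(x[0] - 1.0) + abs(x[1] + 2.0)


def l1_subgradient(x):
    """One valid subgradient of l1_objective at x."""
    sign = lambda v: (v > 0) - (v < 0)
    return [float(sign(x[0] - 1.0)), float(sign(x[1] + 2.0))]


if __name__ == "__main__":
    print(ellipsoid_minimize(l1_objective, l1_subgradient, [0.0, 0.0], 10.0, 200))
```

Each iteration costs $O(n^2)$ arithmetic operations for the matrix update, which is the per-query cost the open problem asks to match on manifolds.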
7.Impulsive optimal control problems with time delays in the drift term
Authors:Giovanni Fusco, Monica Motta
Abstract: We introduce a notion of bounded variation solution for a new class of nonlinear control systems with ordinary and impulsive controls, in which the drift function depends not only on the state, but also on its past history, through a finite number of time delays. After proving the well posedness of such solutions and the continuity of the corresponding input output map with respect to suitable topologies, we establish necessary optimality conditions for an associated optimal control problem. The approach, which involves approximating the problem by a non impulsive optimal control problem with time delays and using Ekeland principle combined with a recent, nonsmooth version of the Maximum Principle for conventional delayed systems, allows us to deal with mild regularity assumptions and a general endpoint constraint.
8.Optimal Algorithm with Complexity Separation for Strongly Convex-Strongly Concave Composite Saddle Point Problems
Authors:Ekaterina Borodich, Georgiy Kormakov, Dmitry Kovalev, Aleksandr Beznosikov, Alexander Gasnikov
Abstract: In this work, we focus on the following saddle point problem $\min_x \max_y p(x) + R(x,y) - q(y)$ where $R(x,y)$ is $L_R$-smooth, $\mu_x$-strongly convex, $\mu_y$-strongly concave and $p(x), q(y)$ are convex and $L_p, L_q$-smooth respectively. We present a new algorithm with optimal overall complexity $\mathcal{O}\left(\left(\sqrt{\frac{L_p}{\mu_x}} + \frac{L_R}{\sqrt{\mu_x \mu_y}} + \sqrt{\frac{L_q}{\mu_y}}\right)\log \frac{1}{\varepsilon}\right)$ and separation of oracle calls in the composite and saddle part. This algorithm requires $\mathcal{O}\left(\left(\sqrt{\frac{L_p}{\mu_x}} + \sqrt{\frac{L_q}{\mu_y}}\right) \log \frac{1}{\varepsilon}\right)$ oracle calls for $\nabla p(x)$ and $\nabla q(y)$ and $\mathcal{O} \left( \max\left\{\sqrt{\frac{L_p}{\mu_x}}, \sqrt{\frac{L_q}{\mu_y}}, \frac{L_R}{\sqrt{\mu_x \mu_y}} \right\}\log \frac{1}{\varepsilon}\right)$ oracle calls for $\nabla R(x,y)$ to find an $\varepsilon$-solution of the problem. To the best of our knowledge, we are the first to develop an optimal algorithm with complexity separation in the case $\mu_x \not = \mu_y$. Also, we apply this algorithm to a bilinear saddle point problem and obtain the optimal complexity for this class of problems.
1.Robust stabilization of $2 \times 2$ first-order hyperbolic PDEs with uncertain input delay
Authors:Jing Zhang, Jie Qi
Abstract: A backstepping-based compensator design is developed for a system of $2\times2$ first-order linear hyperbolic partial differential equations (PDEs) in the presence of an uncertain long input delay at the boundary. We introduce a transport PDE to represent the delayed input, which leads to three coupled first-order hyperbolic PDEs. A novel backstepping transformation, composed of two Volterra transformations and an affine Volterra transformation, is introduced for the predictive control design. The resulting kernel equations from the affine Volterra transformation are two coupled first-order PDEs, each with two boundary conditions, which complicates the well-posedness analysis. We address this challenge by using the method of characteristics and successive approximation. To analyze the sensitivity of the closed-loop system to the uncertain input delay, we introduce a neutral system which captures the control effect resulting from the delay uncertainty. It is proved that the proposed control is robust to small delay variations. Numerical examples illustrate the performance of the proposed compensator.
2.Second-order optimality conditions for bilevel programs
Authors:Xiang Liu, Mengwei Xu, Liwei Zhang
Abstract: Second-order optimality conditions of bilevel programming problems depend on the second-order directional derivatives of the value functions or the solution mappings of the lower level problems under some regularity conditions, which cannot be calculated or evaluated. To overcome this difficulty, we propose the notion of the bi-local solution. Under the Jacobian uniqueness conditions for the lower level problem, we prove that the bi-local solution is a local minimizer of some one-level minimization problem. Based on this property, the first-order necessary optimality conditions and second-order necessary and sufficient optimality conditions for the bi-local optimal solution of a given bilevel program are established. The second-order optimality conditions proposed here only involve second-order derivatives of the defining functions of the bilevel problem. The second-order sufficient optimality conditions are used to derive the Q-linear convergence rate of the classical augmented Lagrangian method.
3.Neural Operators for Delay-Compensating Control of Hyperbolic PIDEs
Authors:Jie Qi, Jing Zhang, Miroslav Krstic
Abstract: The recently introduced DeepONet operator-learning framework for PDE control is extended from the results for basic hyperbolic and parabolic PDEs to an advanced hyperbolic class that involves delays on both the state and the system output or input. The PDE backstepping design produces gain functions that are outputs of a nonlinear operator, mapping functions on a spatial domain into functions on a spatial domain, and where this gain-generating operator's inputs are the PDE's coefficients. The operator is approximated with a DeepONet neural network to a degree of accuracy that is provably arbitrarily tight. Once we produce this approximation-theoretic result in infinite dimension, with it we establish stability in closed loop under feedback that employs approximate gains. In addition to supplying such results under full-state feedback, we also develop DeepONet-approximated observers and output-feedback laws and prove their own stabilizing properties under neural operator approximations. With numerical simulations we illustrate the theoretical results and quantify the numerical effort savings, which are of two orders of magnitude, thanks to replacing the numerical PDE solving with the DeepONet.
4.Note on Steepest Descent Algorithm for Quasi L$^{\natural}$-convex Function Minimization
Authors:Kazuo Murota, Akiyoshi Shioura
Abstract: We define a class of discrete quasi convex functions, called semi-strictly quasi L$^{\natural}$-convex functions, and show that the steepest descent algorithm for L$^{\natural}$-convex function minimization also works for this class of quasi convex functions. The analysis of the exact number of iterations is also extended, revealing the so-called geodesic property of the steepest descent algorithm when applied to semi-strictly quasi L$^{\natural}$-convex functions.
5.Forward Completeness and Applications to Control of Automated Vehicles
Authors:Iasson Karafyllis, Dionysis Theodosis, Markos Papageorgiou
Abstract: Forward complete systems are guaranteed to have solutions that exist globally for all positive time. In this paper, a relaxed Lyapunov-like condition for forward completeness is presented for finite-dimensional systems defined on open sets that does not require boundedness of the Lyapunov-like function along the solutions of the system. The corresponding condition is then exploited for the design of autonomous two-dimensional movement, with focus on lane-free cruise controllers for automated vehicles described by the bicycle kinematic model. The derived feedback laws (cruise controllers) are decentralized and can account for collision avoidance, roads of variable width, on-ramps and off-ramps as well as different desired speed for each vehicle.
6.Further Remarks on the Sampled-Data Feedback Stabilization Problem
Authors:John Tsinias, Dionysis Theodosis
Abstract: The paper deals with the problem of sampled-data feedback stabilization for autonomous nonlinear systems. The corresponding results extend those obtained in earlier works by the same authors. The sufficient conditions we establish are based on the existence of discontinuous control Lyapunov functions and the corresponding results are applicable to a class of nonlinear systems that are affine in the control.
7.Simultaneous Planning of Liner Ship Speed Optimization, Fleet Deployment, Scheduling and Cargo Allocation with Container Transshipment
Authors:Jasashwi Mandal, Adrijit Goswami, Lakshman Thakur, Manoj Kumar Tiwari
Abstract: Due to substantial growth in world waterborne trade volumes and drastic changes in the global climate attributed to CO2 emissions, shipping companies need to improve their operational and energy efficiency. Therefore, a multi-objective mixed-integer non-linear programming (MINLP) model is proposed in this study to simultaneously determine the optimal service schedule, number of vessels in a fleet serving each route, vessel speed between two ports of call, and flow of cargo considering transshipment operations for each pair of origin-destination. This MINLP model presents a trade-off between economic and environmental aspects considering total shipping time and overall shipping cost as the two conflicting objectives. The shipping cost comprises CO2 emission, fuel consumption and several operational costs, where fuel consumption is determined using speed and load. Two efficient evolutionary algorithms, Nondominated Sorting Genetic Algorithm II (NSGA-II) and Online Clustering-based Evolutionary Algorithm (OCEA), are applied to attain near-optimal solutions of the proposed problem. Furthermore, six problem instances of different sizes are solved using these algorithms to validate the proposed model.
8.A more efficient reformulation of complex SDP as real SDP
Authors:Jie Wang
Abstract: This note proposes a novel reformulation of complex semidefinite programs (SDPs) as real SDPs by using Lagrange duality. As an application, we present an economical reformulation of complex SDP relaxations of complex polynomial optimization problems as real SDPs and derive some further reductions by exploiting the structure of the complex SDP relaxations. Various numerical examples demonstrate that our new reformulation runs several times (one order of magnitude in some cases) faster than the usual popular reformulation.
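For context, the widely used baseline reformulation rests on the fact that a Hermitian matrix $H = A + iB$ is positive semidefinite exactly when the real symmetric matrix $\begin{bmatrix} A & -B \\ B & A \end{bmatrix}$ is. The Python sketch below builds that embedding and checks numerically that the two quadratic forms agree. It illustrates only this standard embedding, not the Lagrange-duality reformulation proposed in the note; all function names are hypothetical.

```python
import random


def real_embedding(H):
    """Map an n x n complex matrix H = A + iB to the 2n x 2n real matrix [[A, -B], [B, A]]."""
    n = len(H)
    top = [[H[i][j].real for j in range(n)] + [-H[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[H[i][j].imag for j in range(n)] + [H[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom


def complex_quadratic_form(H, z):
    """Compute z^* H z for a complex vector z."""
    n = len(H)
    return sum(z[i].conjugate() * H[i][j] * z[j] for i in range(n) for j in range(n))


def real_quadratic_form(M, w):
    """Compute w^T M w for a real vector w."""
    m = len(M)
    return sum(w[i] * M[i][j] * w[j] for i in range(m) for j in range(m))


def random_hermitian(n, rng):
    """Random Hermitian test matrix (G + G^*) / 2."""
    G = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(n)]
    return [[(G[i][j] + G[j][i].conjugate()) / 2 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    H = random_hermitian(3, rng)
    z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)]
    w = [c.real for c in z] + [c.imag for c in z]  # stack real and imaginary parts
    print(complex_quadratic_form(H, z).real, real_quadratic_form(real_embedding(H), w))
```

The embedding doubles the matrix size, which is the cost that more economical reformulations aim to reduce.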
9.Vector-borne disease outbreak control via instant releases
Authors:Luis Almeida, Jesús Bellver Arnau, Yannick Privat, Carlota Rebelo
Abstract: This paper is devoted to the study of optimal release strategies to control vector-borne diseases, such as dengue, Zika, chikungunya and malaria. Two techniques are considered: the sterile insect technique (SIT), which consists in releasing sterilized males among wild vectors in order to perturb their reproduction, and the Wolbachia technique (presently used mainly for mosquitoes), which consists in releasing vectors that are infected with a bacterium limiting their vector capacity, in order to replace the wild population by one with reduced vector capacity. In each case, the time dynamics of the vector population is modeled by a system of ordinary differential equations in which the releases are represented by linear combinations of Dirac measures with positive coefficients determining their intensity. We introduce optimal control problems that we solve numerically using ad-hoc algorithms, based on writing first-order optimality conditions characterizing the best combination of Dirac measures. We then discuss the results obtained, focusing in particular on the complexity and efficiency of optimal controls and comparing the strategies obtained. Mathematical modeling can help test a great number of scenarios that are potentially interesting in future interventions (even those that are orthogonal to the present strategies) but that would be hard, costly or even impossible to test in the field in present conditions.
10.About the Blaschke-Santalo diagram of area, perimeter and moment of inertia
Authors:Raphael Gastaldello, Antoine Henrot, Ilaria Lucardesi
Abstract: We study the Blaschke-Santaló diagram associated to the area, the perimeter, and the moment of inertia. We work in dimension 2, under two assumptions on the shapes: convexity and the presence of two orthogonal axes of symmetry. We discuss topological and geometrical properties of the diagram. As a by-product we address a conjecture by Pólya, in the simplified setting of double symmetry.
11.A Sampling-Based Method for Gittins Index Approximation
Authors:Stef Baas, Richard J. Boucherie, Aleida Braaksma
Abstract: A sampling-based method is introduced to approximate the Gittins index for a general family of alternative bandit processes. The approximation consists of a truncation of the optimization horizon and support for the immediate rewards, an optimal stopping value approximation, and a stochastic approximation procedure. Finite-time error bounds are given for the three approximations, leading to a procedure to construct a confidence interval for the Gittins index using a finite number of Monte Carlo samples, as well as an epsilon-optimal policy for the Bayesian multi-armed bandit. Proofs are given for almost sure convergence and convergence in distribution for the sampling based Gittins index approximation. In a numerical study, the approximation quality of the proposed method is verified for the Bernoulli bandit and Gaussian bandit with known variance, and the method is shown to significantly outperform Thompson sampling and the Bayesian Upper Confidence Bound algorithms for a novel random effects multi-armed bandit.
1.A Generalized Pell's equation for a class of multivariate orthogonal polynomials
Authors:Jean-Bernard Lasserre, Yuan Xu
Abstract: We extend the polynomial Pell's equation satisfied by univariate Chebyshev polynomials on $[-1, 1]$ from one variable to several variables, using orthogonal polynomials on regular domains that include cubes, balls, and simplexes of arbitrary dimension. Moreover, we show that such an equation is strongly connected (i) to a certificate of positivity (from real algebraic geometry) on the domain, as well as (ii) to the Christoffel functions of the equilibrium measure on the domain. In addition, the solution to Pell's equation reflects an extremal property of orthonormal polynomials associated with an entropy-like criterion.
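The univariate starting point is the classical identity $T_n(x)^2 - (x^2 - 1)\,U_{n-1}(x)^2 = 1$, where $T_n$ and $U_n$ are the Chebyshev polynomials of the first and second kind. The short Python sketch below checks it in exact rational arithmetic at a few sample points. The function names are illustrative, and the multivariate extension from the paper is not covered.

```python
from fractions import Fraction


def chebyshev_T_and_U(n, x):
    """Return (T_n(x), U_{n-1}(x)) for n >= 1 via the three-term recurrence."""
    t_prev, t_cur = Fraction(1), x            # T_0, T_1
    u_prev, u_cur = Fraction(0), Fraction(1)  # U_{-1} := 0, U_0
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
        u_prev, u_cur = u_cur, 2 * x * u_cur - u_prev
    return t_cur, u_cur


def pell_residual(n, x):
    """Compute T_n(x)^2 - (x^2 - 1) * U_{n-1}(x)^2 - 1, which should vanish."""
    t, u = chebyshev_T_and_U(n, x)
    return t * t - (x * x - 1) * u * u - 1


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, pell_residual(n, Fraction(3, 7)))
```

Because the inputs are `Fraction` objects, the residual is computed without rounding, so an exact zero is a genuine confirmation at the tested points.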
2.Gotta catch 'em all: Modeling All Discrete Alternatives for Industrial Energy System Transitions
Authors:Hendrik Schricker, Benedikt Schuler, Christiane Reinert, Niklas von der Aßen
Abstract: Industrial decision-makers often base decisions on mathematical optimization models to achieve cost-efficient design solutions in energy transitions. However, since a model can only approximate reality, the optimal solution is not necessarily the best real-world energy system. Exploring near-optimal design spaces, e.g., by the Modeling All Alternatives (MAA) method, provides a more holistic view of decision alternatives beyond the cost-optimal solution. However, the MAA method misses out on discrete investment decisions. Incorporating such discrete investment decisions is crucial when modeling industrial energy systems. Our work extends the MAA method by integrating discrete design decisions. We optimize the design and operation of an industrial energy system transformation using a mixed-integer linear program. First, we explore the continuous, near-optimal design space by applying the MAA method. Thereafter, we sample all discrete design alternatives from the continuous, near-optimal design space. In a case study, we apply our method to identify all near-optimal design alternatives of an industrial energy system. We find 128 near-optimal design alternatives where costs are allowed to increase to a maximum of one percent, offering decision-makers more flexibility in their investment decisions. Our work enables the analysis of discrete design alternatives for industrial energy transitions and supports the decision-making process for investments in energy infrastructure.
3.A unified observability result for non-autonomous observation problems
Authors:Fabian Gabel, Albrecht Seelmann
Abstract: A final-state observability result in the Banach space setting for non-autonomous observation problems is obtained that covers and extends all previously known results in this context, while providing a streamlined proof that follows the established Lebeau-Robbiano strategy.
4.Quantifying low rank approximations of third order symmetric tensors
Authors:Shenglong Hu, Defeng Sun, Kim-Chuan Toh
Abstract: In this paper, we present a method to certify the approximation quality of a low rank tensor to a given third order symmetric tensor. Under mild assumptions, the best low rank approximation is attained if a control parameter is zero, or a quantified quasi-optimal low rank approximation is obtained if the control parameter is positive. This is based on a primal-dual method for computing a low rank approximation for a given tensor. The certification is derived from the global optimality of the primal and dual problems, and is characterized by easily checkable relations between the primal and the dual solutions together with another rank condition. The theory is verified theoretically for orthogonally decomposable tensors as well as numerically through examples in the general case.
5.Decentralized conditional gradient method over time-varying graphs
Authors:Roman Vedernikov, Alexander Rogozin, Alexander Gasnikov
Abstract: In this paper we study a generalization of the distributed conditional gradient method to time-varying network architectures. We theoretically analyze convergence properties of the algorithm and provide numerical experiments. The time-varying network is modeled as a deterministic or a stochastic sequence of graphs.
1.On the Bredies-Chenchene-Lorenz-Naldi algorithm
Authors:Heinz H. Bauschke, Walaa M. Moursi, Shambhavi Singh, Xianfu Wang
Abstract: Monotone inclusion problems occur in many areas of optimization and variational analysis. Splitting methods, which utilize resolvents or proximal mappings of the underlying operators, are often applied to solve these problems. In 2022, Bredies, Chenchene, Lorenz, and Naldi introduced a new elegant algorithmic framework that encompasses various well known algorithms including Douglas-Rachford and Chambolle-Pock. They obtained powerful weak and strong convergence results, where the latter type relies on additional strong monotonicity assumptions. In this paper, we complement the analysis by Bredies et al. by relating the projections of the fixed point sets of the underlying operators that generate the (reduced and original) preconditioned proximal point sequences. We also obtain strong convergence results in the case of linear relations. Various examples are provided to illustrate the applicability of our results.
2.Stopping Rules for Gradient Method for Saddle Point Problems with Twoside Polyak-Lojasievich Condition
Authors:Muratidi A. Ya., Stonyakin F. S
Abstract: The paper considers approaches to saddle point problems with a two-sided variant of the Polyak-Lojasievich condition based on the gradient method with inexact information and proposes a stopping rule based on the smallness of the norm of the inexact gradient of the external subproblem. Achieving this rule in combination with a suitable accuracy of solving the auxiliary subproblem ensures that the quality of the original saddle point problem is acceptable. The results of numerical experiments for various saddle point problems are discussed to illustrate the effectiveness of the proposed method, including the comparison with proven convergence rate estimates.
3.Information Structures in AC/DC Grids
Authors:Josh A. Taylor
Abstract: The converters in an AC/DC grid form actuated boundaries between the AC and DC subgrids. We show how in both simple linear and balanced dq-frame models, the states on either side of these boundaries are coupled only by control inputs. This topological property imparts all AC/DC grids with poset-causal information structures. A practical benefit is that certain decentralized control problems that are hard in general are tractable for poset-causal systems. We also show that special cases like multi-terminal DC grids can have coordinated and leader-follower information structures.
4.Inexact Direct-Search Methods for Bilevel Optimization Problems
Authors:Youssef Diouane, Vyacheslav Kungurtsev, Francesco Rinaldi, Damiano Zeffiro
Abstract: In this work, we introduce new direct search schemes for the solution of bilevel optimization (BO) problems. Our methods rely on a fixed accuracy black box oracle for the lower-level problem, and deal both with smooth and potentially nonsmooth true objectives. We thus analyze for the first time in the literature direct search schemes in these settings, giving convergence guarantees to approximate stationary points, as well as complexity bounds in the smooth case. We also propose the first adaptation of mesh adaptive direct search schemes for BO. Some preliminary numerical results on a standard set of bilevel optimization problems show the effectiveness of our new approaches.
5.A non-monotone extra-gradient trust-region method with noisy oracles
Authors:Natasa Krejic, Natasa Krklec Jerinkic, Angeles Martinez, Mahsa Yousefi
Abstract: In this work, we introduce a novel stochastic second-order method, within the framework of a non-monotone trust-region approach, for solving the unconstrained, nonlinear, and non-convex optimization problems arising in the training of deep neural networks. The proposed algorithm makes use of subsampling strategies which yield noisy approximations of the finite sum objective function and its gradient. To effectively control the resulting approximation error, we introduce an adaptive sample size strategy based on inexpensive additional sampling. Depending on the estimated progress of the algorithm, this can yield sample size scenarios ranging from mini-batch to full sample functions. We provide convergence analysis for all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework. We report numerical experiments showing that the proposed algorithm outperforms its state-of-the-art counterpart in deep neural network training for image classification and regression tasks while requiring a significantly smaller number of gradient evaluations.
6.Convergence Guarantees for Stochastic Subgradient Methods in Nonsmooth Nonconvex Optimization
Authors:Nachuan Xiao, Xiaoyin Hu, Kim-Chuan Toh
Abstract: In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods.
7.An Operator-Splitting Approach for Variational Optimal Control Formulations for Diffeomorphic Shape Matching
Authors:Andreas Mang, Jiwen He, Robert Azencott
Abstract: We present formulations and numerical algorithms for solving diffeomorphic shape matching problems. We formulate shape matching as a variational problem governed by a dynamical system that models the flow of diffeomorphism $f_t \in \operatorname{diff}(\mathbb{R}^3)$. We overview our contributions in this area, and present an improved, matrix-free implementation of an operator splitting strategy for diffeomorphic shape matching. We showcase results for diffeomorphic shape matching of real clinical cardiac data in $\mathbb{R}^3$ to assess the performance of our methodology.
1.Solution of the Optimal Control Problem for the Cahn-Hilliard Equation Using Finite Difference Approximation
Authors:Gobinda Garai, Bankim C. Mandal
Abstract: This paper is concerned with designing, analyzing and implementing linear and nonlinear discretization schemes for the distributed optimal control problem (OCP) with the Cahn-Hilliard (CH) equation as a constraint. We propose three difference schemes to approximate and investigate the solution behaviour of the OCP for the CH equation. We present the convergence analysis of the proposed discretization. We verify our findings by presenting numerical experiments.
2.Harnessing the mathematics of matrix decomposition to solve planted and maximum clique problem
Authors:Salma Omer, Montaz Ali
Abstract: We consider the problem of identifying a maximum clique in a given graph. We have proposed a mathematical model for this problem. The model resembles the matrix decomposition of the adjacency matrix of a given graph. The objective function of the mathematical model includes a weighted $\ell_{1}$-norm of the sparse matrix of the decomposition, which has an advantage over the known $\ell_{1}$-norm in reducing the error. The use of dynamically changing weights for the $\ell_{1}$-norm is motivated. We have used proximal operators within the iterates of the ADMM (alternating direction method of multipliers) algorithm to solve the optimization problem. Convergence of the proposed ADMM algorithm has been established. The theoretical guarantee of the maximum clique in the form of the low-rank matrix has also been established using the golfing scheme to construct approximate dual certificates. We have constructed conditions that guarantee the recovery and uniqueness of the solution, as well as a tight bound on the dual matrix that validates optimality conditions. Numerical results for planted cliques are presented showing clear advantages of our model when compared with two recent mathematical models. Results are also presented for randomly generated graphs with minimal errors. These errors are found using a formula we have proposed based on the size of the clique. Moreover, we have applied our algorithm to real-world graphs for which cliques have been recovered successfully. The validity of these clique sizes comes from the decomposition of the input graph into a rank-one matrix (corresponding to the clique) and a sparse matrix.
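The proximal operator of a weighted $\ell_1$-norm, the kind of building block an ADMM iteration would call for the sparse part of such a decomposition, has a closed form: entrywise soft-thresholding with a per-entry threshold. The Python sketch below shows that operator on a flat list of entries. It is a generic illustration under assumed names (`weighted_soft_threshold`, `lam`), not the authors' implementation or their weight-update rule.

```python
def weighted_soft_threshold(x, weights, lam):
    """Proximal operator of lam * sum_i w_i * |x_i|, applied entrywise.

    Each entry is shrunk toward zero by its own threshold lam * w_i.
    """
    out = []
    for xi, wi in zip(x, weights):
        t = lam * wi
        if xi > t:
            out.append(xi - t)
        elif xi < -t:
            out.append(xi + t)
        else:
            out.append(0.0)
    return out


if __name__ == "__main__":
    # Larger weights push the corresponding entries harder toward zero.
    print(weighted_soft_threshold([3.0, -0.5, -2.0], [1.0, 1.0, 0.5], 1.0))
```

For a matrix variable the same function can be applied to the flattened entries, since the weighted $\ell_1$-norm is separable.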
3.Globally solving the Gromov-Wasserstein problem for point clouds in low dimensional Euclidean spaces
Authors:Martin Ryner, Jan Kronqvist, Johan Karlsson
Abstract: This paper presents a framework for computing the Gromov-Wasserstein problem between two sets of points in low dimensional spaces, where the discrepancy is the squared Euclidean norm. The Gromov-Wasserstein problem is a generalization of the optimal transport problem that finds the assignment between two sets preserving pairwise distances as much as possible. This can be used to quantify the similarity between two formations or shapes, a common problem in AI and machine learning. The problem can be formulated as a Quadratic Assignment Problem (QAP), which is in general computationally intractable even for small problems. Our framework addresses this challenge by reformulating the QAP as an optimization problem with a low-dimensional domain, leveraging the fact that the problem can be expressed as a concave quadratic optimization problem with low rank. The method scales well with the number of points, and it can be used to find the global solution for large-scale problems with thousands of points. We compare the computational complexity of our approach with state-of-the-art methods on synthetic problems and apply it to a near-symmetrical problem which is of particular interest in computational biology.
4.Decentralized Stochastic Linear-Quadratic Optimal Control with Risk Constraint and Partial Observation
Authors:Jia Hui, Yuan-Hua Ni
Abstract: This paper addresses a risk-constrained decentralized stochastic linear-quadratic optimal control problem with one remote controller and one local controller, where the risk constraint is imposed on the cumulative state weighted variance in order to reduce the oscillation of the system trajectory. In this model, the local controller can only partially observe the system state, and sends its state estimate to the remote controller through an unreliable channel, whereas the channel from the remote controller to the local controller is perfect. For the considered constrained optimization problem, we first incorporate the risk constraint into the cost function as a penalty through the Lagrange multiplier method, and the resulting augmented cost function includes a quadratic mean-field term of the state. Next, for any fixed multiplier, explicit solutions to the finite-horizon and infinite-horizon mean-field decentralized linear-quadratic problems are derived, together with a necessary and sufficient condition for the mean-square stability of the optimal system. Then, an approach to finding the optimal Lagrange multiplier is presented based on the bisection method. Finally, two numerical examples are given to show the efficiency of the obtained results.
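The multiplier search described in the abstract above can be pictured with a generic bisection loop: for each fixed multiplier the penalized problem is solved, and the multiplier is raised or lowered depending on whether the constraint is violated. The Python sketch below uses a scalar toy problem (minimize $(x-2)^2$ subject to $x^2 \le 1$) in place of the linear-quadratic control problem. The function names, the bracket, and the toy problem are assumptions made for illustration.

```python
def bisect_multiplier(constraint_gap, lo, hi, tol=1e-10, max_iter=200):
    """Bisection on a Lagrange multiplier.

    Assumes constraint_gap is continuous and nonincreasing in the multiplier,
    with constraint_gap(lo) > 0 >= constraint_gap(hi).
    """
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if constraint_gap(mid) > 0.0:
            lo = mid  # constraint still violated: penalize more
        else:
            hi = mid  # constraint satisfied: try a smaller multiplier
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def toy_gap(lam):
    """Constraint violation x(lam)^2 - 1 for the toy problem.

    x(lam) = 2 / (1 + lam) minimizes (x - 2)^2 + lam * x^2.
    """
    x = 2.0 / (1.0 + lam)
    return x * x - 1.0


if __name__ == "__main__":
    print(bisect_multiplier(toy_gap, 0.0, 10.0))
```

In the toy problem the constraint becomes active at a multiplier of 1, where the penalized minimizer sits exactly on the boundary $x = 1$.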
5.A Sweeping Process Control Problem Subject To Mixed Constraints
Authors:Karla L. Cortez, Nathalie T. Khalil, Julio E. Solis
Abstract: In this study, we investigate optimal control problems that involve sweeping processes with a drift term and mixed inequality constraints. Our goal is to establish necessary optimality conditions for these problems. We address the challenges that arise due to the combination of sweeping processes and inequality mixed constraints in two contexts: regular and non-regular. This requires working with different types of multipliers, such as finite positive Radon measures for the sweeping term and integrable functions for regular mixed constraints. For non-regular mixed constraints, the multipliers correspond to purely finitely additive set functions.
6.BOP-Elites, a Bayesian Optimisation Approach to Quality Diversity Search with Black-Box descriptor functions
Authors:Paul Kent, Adam Gaier, Jean-Baptiste Mouret, Juergen Branke
Abstract: Quality Diversity (QD) algorithms such as MAP-Elites are a class of optimisation techniques that attempt to find many high performing points that all behave differently according to a user-defined behavioural metric. In this paper we propose the Bayesian Optimisation of Elites (BOP-Elites) algorithm. Designed for problems with expensive black-box fitness and behaviour functions, it is able to return a QD solution-set with excellent final performance already after a relatively small number of samples. BOP-Elites models both fitness and behavioural descriptors with Gaussian Process (GP) surrogate models and uses Bayesian Optimisation (BO) strategies for choosing points to evaluate in order to solve the quality-diversity problem. In addition, BOP-Elites produces high quality surrogate models which can be used after convergence to predict solutions with any behaviour in a continuous range. An empirical comparison shows that BOP-Elites significantly outperforms other state-of-the-art algorithms without the need for problem-specific parameter tuning.
7.Disturbance decoupled functional observers for fault estimation in nonlinear systems
Authors:Sunjeev Venkateswaran, Costas Kravaris
Abstract: This work deals with the problem of designing disturbance decoupled observers for the estimation of a function of the states in nonlinear systems. Necessary and sufficient conditions for the existence of lower order disturbance decoupled functional observers with linear dynamics and linear output map are derived. Based on this methodology, a fault-estimation scheme based on disturbance decoupled observers will be presented. Throughout the paper, the application of the results will be illustrated through a chemical reactor case study.
8.Grid-Forming Hybrid Angle Control: Behavior, Stability, Variants and Verification
Authors:Ali Tayyebi, Denis Vettoretti, Adolfo Anta, Florian Dörfler
Abstract: This work explores the stability, behavior, variants, and a controller-hardware-in-the-loop (C-HiL) verification of the recently proposed grid-forming (GFM) hybrid angle control (HAC). We revisit the foundation of GFM HAC, and highlight its behavioral properties in relation to the conventional synchronous machine (SM). Next, we introduce the required complementary controls to be combined with the HAC to realize a GFM behavior. The characterization of the analytical operating point and nonlinear energy-based stability analysis of a grid-connected converter under the HAC is presented. Further, we consider various output filter configurations and derive an approximation for the original control proposal. Moreover, we provide details on the integration of GFM HAC into a complex converter control architecture and introduce several variants of the standard HAC. Finally, the performance of GFM HAC is verified by several test scenarios in a C-HiL setup to test its behavior against real-world effects such as noise and delays.
9.Jointly Improving the Sample and Communication Complexities in Decentralized Stochastic Minimax Optimization
Authors:Xuan Zhang, Gabriel Mancino-Ball, Necdet Serhat Aybat, Yangyang Xu
Abstract: We propose a novel single-loop decentralized algorithm called DGDA-VR for solving the stochastic nonconvex strongly-concave minimax problem over a connected network of $M$ agents. By using stochastic first-order oracles to estimate the local gradients, we prove that our algorithm finds an $\epsilon$-accurate solution with $\mathcal{O}(\epsilon^{-3})$ sample complexity and $\mathcal{O}(\epsilon^{-2})$ communication complexity, both of which are optimal and match the lower bounds for this class of problems. Unlike competitors, our algorithm does not require multiple communications for the convergence results to hold, making it applicable to a broader computational environment setting. To the best of our knowledge, this is the first such algorithm to jointly optimize the sample and communication complexities for the problem considered here.
1.Convex Bi-Level Optimization Problems with Non-smooth Outer Objective Function
Authors:Roey Merchav, Shoham Sabach
Abstract: In this paper, we propose the Bi-Sub-Gradient (Bi-SG) method, which is a generalization of the classical sub-gradient method to the setting of convex bi-level optimization problems. This is a first-order method that is very easy to implement in the sense that it requires only a computation of the associated proximal mapping or a sub-gradient of the outer non-smooth objective function, in addition to a proximal gradient step on the inner optimization problem. We show, under very mild assumptions, that Bi-SG tackles bi-level optimization problems and achieves sub-linear rates both in terms of the inner and outer objective functions. Moreover, if the outer objective function is additionally strongly convex (still could be non-smooth), the outer rate can be improved to a linear rate. Last, we prove that the distance of the generated sequence to the set of optimal solutions of the bi-level problem converges to zero.
2.Global convergence of a BFGS-type algorithm for nonconvex multiobjective optimization problems
Authors:L. F. Prudente, D. R. Souza
Abstract: We propose a modified BFGS algorithm for multiobjective optimization problems with global convergence, even in the absence of convexity assumptions on the objective functions. Furthermore, we establish the superlinear convergence of the method under usual conditions. Our approach employs Wolfe step sizes and ensures that the Hessian approximations are updated and corrected at each iteration to address the lack of convexity assumption. Numerical results show that the introduced modifications preserve the practical efficiency of the BFGS method.
3.Robust Combinatorial Optimization Problems Under Budgeted Interdiction Uncertainty
Authors:Marc Goerigk, Mohammad Khosravi
Abstract: In robust combinatorial optimization, we would like to find a solution that performs well under all realizations of an uncertainty set of possible parameter values. How we model this uncertainty set has a decisive influence on the complexity of the corresponding robust problem. For this reason, budgeted uncertainty sets are often studied, as they enable us to decompose the robust problem into easier subproblems. We propose a variant of discrete budgeted uncertainty for cardinality-based constraints or objectives, where a weight vector is applied to the budget constraint. We show that while the adversarial problem can be solved in linear time, the robust problem becomes NP-hard and not approximable. We discuss different possibilities to model the robust problem and show experimentally that despite the hardness result, some models scale relatively well in the problem size.
1.Conic cancellation laws and some applications
Authors:Marius Durea, Elena-Andreea Florea
Abstract: We discuss, on finite and infinite dimensional normed vector spaces, some versions of Radstr\"{o}m cancellation law (or lemma) that are suited for applications to set optimization problems. In this sense, we call our results "conic" variants of the celebrated result of Radstr\"{o}m, since they involve the presence of an ordering cone on the underlying space. Several adaptations to this context of some topological properties of sets are studied and some applications to subdifferential calculus associated to set-valued maps and to necessary optimality conditions for constrained set optimization problems are given. Finally, a stability problem is considered.
2.Stable domains for higher order elliptic operators
Authors:Jean-François Grosjean, Antoine Lemenant, Rémy Mougenot
Abstract: This paper is devoted to proving that any domain satisfying a $(\delta_0,r_0)-$capacity condition of first order is automatically $(m,p)-$stable for all $m\geqslant 1$ and $p\geqslant 1$, and for any dimension $N\geqslant 1$. In particular, this includes regular enough domains such as $\mathscr{C}^1-$domains, Lipschitz domains, Reifenberg flat domains, but is weak enough to also include cusp points. Our result extends some of the results of Hayouni and Pierre valid only for $N=2,3$, and extends also the results of Bucur and Zolesio for higher order operators, with a different and simpler proof.
3.Stability analysis of the Navier-Stokes velocity tracking problem with bang-bang controls
Authors:Alberto Domínguez Corella, Nicolai Jork, Šarká Nečasová, John Sebastian H. Simon
Abstract: This paper focuses on the stability of solutions for a velocity-tracking problem associated with the two-dimensional Navier-Stokes equations. The considered optimal control problem does not possess any regularizer in the cost, and hence bang-bang solutions can be expected. We investigate perturbations that account for uncertainty in the tracking data and the initial condition of the state, and analyze the convergence rate of solutions when the original problem is regularized by the Tikhonov term. The stability analysis relies on the H\"older subregularity of the optimality mapping, which stems from the necessary conditions of the problem.
4.Projection onto a Capped Rotated Second-Order Cone with Applications to Sparse Regression Relaxations
Authors:Noam Goldberg, Ishy Zagdoun
Abstract: This paper establishes a closed-form expression for projecting onto a capped rotated second-order cone. This special object is a convex set that arises as a part of the feasible region of the perspective relaxation of mixed-integer nonlinear programs (MINLP) with binary indicator variables. The rapid computation of the projection onto this convex set enables the development of effective methods for solving the continuous relaxation of MINLPs whose feasible region may involve a Cartesian product of a large number of such sets. As a proof of concept for the applicability of our projection method, we develop a projected gradient method and specialize a general form of FISTA to use our projection technique in order to effectively solve the continuous perspective relaxation of a sparse regression problem with $L_0$ and $L_2$ penalties. We also generalize the basic sparse regression formulation and solution method to support group sparsity. In experiments we first demonstrate that the projection problem is solved faster and more accurately with our closed-form expression than with an interior-point solver, and also that, when solving sparse regression problems, our methods that apply our projection formula can outperform a state-of-the-art interior-point solver while nearly matching its solution accuracy.
5.A Unified Distributed Method for Constrained Networked Optimization via Saddle-Point Dynamics
Authors:Yi Huang, Ziyang Meng, Jian Sun, Wei Ren
Abstract: This paper develops a unified distributed method for solving two classes of constrained networked optimization problems, i.e., optimal consensus problem and resource allocation problem with non-identical set constraints. We first transform these two constrained networked optimization problems into a unified saddle-point problem framework with set constraints. Subsequently, two projection-based primal-dual algorithms via Optimistic Gradient Descent Ascent (OGDA) method and Extra-gradient (EG) method are developed for solving constrained saddle-point problems. It is shown that the developed algorithms achieve exact convergence to a saddle point with an ergodic convergence rate $O(1/k)$ for general convex-concave functions. Based on the proposed primal-dual algorithms via saddle-point dynamics, we develop unified distributed algorithm design and convergence analysis for these two networked optimization problems. Finally, two numerical examples are presented to demonstrate the theoretical results.
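To make the projection-based Extra-gradient (EG) idea concrete, here is a minimal centralized sketch, not the distributed algorithm of the paper. It assumes a toy bilinear saddle-point problem $f(x,y)=xy$ with box constraints $x,y\in[-1,1]$; the function names `project_box` and `extragradient`, the step size, and the starting point are illustrative choices made here, not taken from the abstract.

```python
import numpy as np

def project_box(z, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi] (assumed constraint set)."""
    return np.clip(z, lo, hi)

def extragradient(grad_x, grad_y, x0, y0, step, iters):
    """Projected extragradient for min_x max_y f(x, y) over a box."""
    x, y = x0, y0
    for _ in range(iters):
        # Look-ahead step: descend in x, ascend in y, from the current point.
        x_half = project_box(x - step * grad_x(x, y))
        y_half = project_box(y + step * grad_y(x, y))
        # Actual update: same start point, gradients taken at the look-ahead point.
        x_new = project_box(x - step * grad_x(x_half, y_half))
        y_new = project_box(y + step * grad_y(x_half, y_half))
        x, y = x_new, y_new
    return x, y

# Toy convex-concave problem f(x, y) = x * y with unique saddle point (0, 0).
grad_x = lambda x, y: y
grad_y = lambda x, y: x

x_star, y_star = extragradient(grad_x, grad_y, 0.8, -0.6, step=0.2, iters=2000)
```

On this bilinear example plain simultaneous gradient descent ascent spirals away from the saddle point, whereas the look-ahead evaluation makes each extragradient iteration contract toward $(0,0)$ for step sizes below 1.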
6.A Context-Aware Cutting Plane Selection Algorithm for Mixed-Integer Programming
Authors:Mark Turner, Timo Berthold, Mathieu Besançon
Abstract: The current cut selection algorithm used in mixed-integer programming solvers has remained largely unchanged since its creation. In this paper, we propose a set of new cut scoring measures, cut filtering techniques, and stopping criteria, extending the current state-of-the-art algorithm and obtaining a 4\% performance improvement for SCIP over the MIPLIB 2017 benchmark set.
7.Strict pseudocontractions and demicontractions, their properties and applications
Authors:Andrzej Cegielski
Abstract: We give properties of strict pseudocontractions and demicontractions defined on a Hilbert space, which constitute wide classes of operators that arise in iterative methods for solving fixed point problems. In particular, we give necessary and sufficient conditions under which a convex combination and composition of strict pseudocontractions as well as demicontractions that share a common fixed point is again a strict pseudocontraction or a demicontraction, respectively. Moreover, we introduce a generalized relaxation of composition of demicontraction and give its properties. We apply these properties to prove the weak convergence of a class of algorithms that is wider than the Douglas-Rachford algorithm and projected Landweber algorithms. We have also presented two numerical examples, where we compare the behavior of the presented methods with the Douglas-Rachford method.
8.Inverse Optimization for Routing Problems
Authors:Pedro Zattoni Scroccaro, Piet van Beek, Peyman Mohajerin Esfahani, Bilge Atasoy
Abstract: We propose a method for learning decision-makers' behavior in routing problems using Inverse Optimization (IO). The IO framework falls into the supervised learning category and builds on the premise that the target behavior is an optimizer of an unknown cost function. This cost function is to be learned through historical data, and in the context of routing problems, can be interpreted as the routing preferences of the decision-makers. In this view, the main contributions of this study are to propose an IO methodology with a hypothesis function, loss function, and stochastic first-order algorithm tailored to routing problems. We further test our IO approach in the Amazon Last Mile Routing Research Challenge, where the goal is to learn models that replicate the routing preferences of human drivers, using thousands of real-world routing examples. Our final IO-learned routing model achieves a score that ranks 2nd compared with the 48 models that qualified for the final round of the challenge. Our results showcase the flexibility and real-world potential of the proposed IO methodology to learn from decision-makers' decisions in routing problems.
1.Efficient KKT reformulations for bilevel linear programming
Authors:Christoph Buchheim
Abstract: It is a well-known result that bilevel linear programming is NP-hard. In many publications, reformulations as mixed-integer linear programs are proposed, which suggests that the decision version of the problem belongs to NP. However, to the best of our knowledge, a rigorous proof of membership in NP has never been published, so we close this gap by reporting a simple but not entirely trivial proof. A related question is whether a large enough "big M" for the classical KKT-based reformulation can be computed efficiently, which we answer in the affirmative. In particular, our big M has polynomial encoding length in the original problem data.
2.Weighted tardiness minimization for unrelated machines with sequence-dependent and resource-constrained setups
Authors:Ioannis Avgerinos, Ioannis Mourtos, Stavros Vatikiotis, Georgios Zois
Abstract: Motivated by the need of quick job (re-)scheduling, we examine an elaborate scheduling environment under the objective of total weighted tardiness minimization. The examined problem variant moves well beyond existing literature, as it considers unrelated machines, sequence-dependent and machine-dependent setup times and a renewable resource constraint on the number of simultaneous setups. For this variant, we provide a relaxed MILP to calculate lower bounds, thus estimating a worst-case optimality gap. As a fast exact approach appears not plausible for instances of practical importance, we extend known (meta-)heuristics to deal with the problem at hand, coupling them with a Constraint Programming (CP) component - vital to guarantee the non-violation of the problem's constraints - which optimally allocates resources with respect to tardiness minimization. The validity and versatility of employing different (meta-)heuristics exploiting a relaxed MILP as a quality measure is revealed by our extensive experimental study, which shows that the methods deployed have complementary strengths depending on the instance parameters. Since the problem description has been obtained from a textile manufacturer where jobs of diverse size arrive continuously under tight deadlines, we also discuss the practical impact of our approach in terms of both tardiness decrease and broader managerial insights.
3.Hypergraph-Based Fast Distributed AC Power Flow Optimization
Authors:Xinliang Dai, Yingzhao Lian, Yuning Jiang, Colin N. Jones, Veit Hagenmeyer
Abstract: This paper presents a novel distributed approach for solving AC power flow (PF) problems. The optimization problem is reformulated into a distributed form using a communication structure corresponding to a hypergraph, by which complex relationships between subgrids can be expressed as hyperedges. Then, a hypergraph-based distributed sequential quadratic programming (HDQ) approach is proposed to handle the reformulated problems, and the hypergraph-based distributed sequential quadratic programming (HDSQP) is used as the inner algorithm to solve the corresponding QP subproblems, which are respectively condensed using Schur complements with respect to coupling variables defined by hyperedges. Furthermore, we rigorously establish the convergence guarantee of the proposed algorithm with a locally quadratic rate and the one-step convergence of the inner algorithm when using the Levenberg-Marquardt regularization. Our analysis also demonstrates that the computational complexity of the proposed algorithm is much lower than that of the state-of-the-art distributed algorithm. We implement the proposed algorithm in an open-source toolbox, i.e., rapidPF, and conduct numerical tests that validate the proof and demonstrate the great potential of the proposed distributed algorithm in terms of communication effort and computational speed.
4.Linear programming sensitivity measured by the optimal value worst-case analysis
Authors:Milan Hladík
Abstract: This paper introduces a concept of a derivative of the optimal value function in linear programming (LP). Basically, it is the worst-case optimal value of an interval LP problem when the nominal data are inflated to intervals according to given perturbation patterns. By definition, the derivative expresses how the optimal value can worsen when the data are subject to variation. In addition, it also gives a certain sensitivity measure or condition number of an LP problem. If the LP problem is nondegenerate, the derivatives are easy to calculate from the computed primal and dual optimal solutions. For degenerate problems, the computation is more difficult. We propose an upper bound and some kind of characterization, but there are many open problems remaining. We carried out numerical experiments with specific LP problems and with real LP data from the Netlib repository. They show that the derivatives give a suitable sensitivity measure of LP problems. It remains an open problem how to efficiently and rigorously handle degenerate problems.
5.Sharpness and well-conditioning of nonsmooth convex formulations in statistical signal recovery
Authors:Lijun Ding, Alex L. Wang
Abstract: We study a sample complexity vs. conditioning tradeoff in modern signal recovery problems where convex optimization problems are built from sampled observations. We begin by introducing a set of condition numbers related to sharpness in $\ell_p$ or Schatten-p norms ($p\in[1,2]$) based on nonsmooth reformulations of a class of convex optimization problems, including sparse recovery, low-rank matrix sensing, covariance estimation, and (abstract) phase retrieval. In each of the recovery tasks, we show that the condition numbers become dimension independent constants once the sample size exceeds some constant multiple of the recovery threshold. Structurally, this result ensures that the inaccuracy in the recovered signal due to both observation noise and optimization error is well-controlled. Algorithmically, such a result ensures that a new first-order method for solving the class of sharp convex functions in a given $\ell_p$ or Schatten-p norm, when applied to the nonsmooth formulations, achieves nearly-dimension-independent linear convergence.
1.Outlier detection in regression: conic quadratic formulations
Authors:Andrés Gómez, José Neto
Abstract: In many applications, when building linear regression models, it is important to account for the presence of outliers, i.e., corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxation and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders-of-magnitude faster than existing big-M formulations in the literature for this problem.
2.Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization
Authors:Massil Hihat, Stéphane Gaïffas, Guillaume Garrigos, Simon Bussy
Abstract: We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses. Our motivation is to consider general demands, losses and dynamics to go beyond standard models which usually rely on newsvendor-type losses, fixed dynamics, and unrealistic i.i.d. demand assumptions. We propose MaxCOSD, an online algorithm that has provable guarantees even for problems with non-i.i.d. demands and stateful dynamics, including for instance perishability. We consider what we call non-degeneracy assumptions on the demand process, and argue that they are necessary to allow learning.
3.On the sharp Makai inequality
Authors:Francesca Prinari, Anna Chiara Zagati
Abstract: On a convex bounded open set, we prove that Poincar\'e-Sobolev constants for functions vanishing at the boundary can be bounded from below in terms of the norm of the distance function in a suitable Lebesgue space. This generalizes a result shown, in the planar case, by E. Makai, for the torsional rigidity. In addition, we compare the sharp Makai constants obtained in the class of convex sets with the optimal constants defined in other classes of open sets. Finally, an alternative proof of the Hersch-Protter inequality for convex sets is given.
4.Integrated supervisory control and fixed path speed trajectory generation for hybrid electric ships via convex optimization
Authors:Antti Ritari, Niklas Katzenburg, Fabricio Oliveira, Kari Tammi
Abstract: Battery-hybrid power source architectures can reduce fuel consumption and emissions for ships with diverse operation profiles. However, conventional control strategies may fail to improve performance if the future operation profile is unknown to the controller. This paper proposes a guidance, navigation, and control (GNC) function that integrates trajectory generation and hybrid power source supervisory control. We focus on time and fuel optimal path-constrained trajectory planning. This problem is a nonlinear and nonconvex optimal control problem, which means that it is not readily amenable to efficient and reliable solution onboard. We propose a nonlinear change of variables and constraint relaxations that transform the nonconvex planning problem into a convex optimal control problem. The nonconvex three-degree-of-freedom dynamics, hydrodynamic forces, fixed pitch propeller, battery, and general energy converter (e.g., fuel cell or generating set) dissipation constraints are expressed in convex functional form. A condition derived from Pontryagin's Minimum Principle guarantees that, when satisfied, the solution of the relaxed problem provides the solution to the original problem. The validity and effectiveness of this approach are numerically illustrated for a battery-hybrid vessel in model scale. First, the convex hydrodynamic hull and rudder force models are validated with towing tank test data. Second, optimal trajectories and supervisory control schemes are evaluated under varying mission requirements. The convexification scheme in this work lays the path for the employment of mature, computationally robust convex optimization methods and creates a novel possibility for real-time optimization onboard future smart and unmanned surface vehicles.
5.A preliminary model for optimal control of moisture content in unsaturated soils
Authors:Marco Berardi, Fabio V. Difonzo, Roberto Guglielmi
Abstract: In this paper we introduce an optimal control approach to Richards' equation in an irrigation framework, aimed at minimizing water consumption while maximizing root water uptake. We first describe the physics of the nonlinear model under consideration, and then develop the first-order necessary optimality conditions of the associated boundary control problem. We show that our model provides a promising framework to support optimized irrigation strategies, thus facing water scarcity in irrigation. The characterization of the optimal control in terms of a suitable relation with the adjoint state of the optimality conditions is then used to develop numerical simulations on different hydrological settings that support the analytical findings of the paper.
6.Further techniques on a polynomial positivity question of Collins, Dykema, and Torres-Ayala
Authors:Nathaniel K. Green, Edward D. Kim
Abstract: We prove that the coefficient of $t^2$ in $\mathsf{trace}((A+tB)^6)$ is a sum of squares in the entries of the symmetric matrices $A$ and $B$.
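As a numerical illustration of the quantity in question (not of the sum-of-squares certificate itself), the sketch below computes the $t^2$ coefficient of $\mathsf{trace}((A+tB)^6)$ for random symmetric matrices. It relies on a standard fact: the 15 words with four $A$'s and two $B$'s fall into three classes under cyclic invariance of the trace, which gives $6\,\mathsf{trace}(A^4B^2)+6\,\mathsf{trace}(A^3BAB)+3\,\mathsf{trace}(A^2BA^2B)$. The helper names and the $4\times 4$ test size are assumptions made here.

```python
import numpy as np

def t2_coefficient(A, B):
    """Coefficient of t^2 in trace((A + tB)^6), grouped by cyclic classes of words."""
    A2 = A @ A
    A3 = A2 @ A
    A4 = A2 @ A2
    return (6 * np.trace(A4 @ B @ B)          # the two B's adjacent: 6 words
            + 6 * np.trace(A3 @ B @ A @ B)    # the two B's one A apart: 6 words
            + 3 * np.trace(A2 @ B @ A2 @ B))  # the two B's two A's apart: 3 words

def random_symmetric(n, rng):
    """Random real symmetric n x n matrix (illustrative test data)."""
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

rng = np.random.default_rng(0)
A = random_symmetric(4, rng)
B = random_symmetric(4, rng)

# Independent check: fit the degree-6 polynomial p(t) = trace((A + tB)^6).
ts = np.linspace(-1.0, 1.0, 13)
values = [np.trace(np.linalg.matrix_power(A + t * B, 6)) for t in ts]
coeffs = np.polyfit(ts, values, 6)  # coefficients, highest degree first
fitted_t2 = coeffs[-3]              # entry multiplying t^2
exact_t2 = t2_coefficient(A, B)
```

A nonnegative value of `exact_t2` on random samples is consistent with the stated result but does not prove it; the abstract's claim is the stronger statement that the coefficient is a sum of squares in the matrix entries.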
7.Provably Faster Gradient Descent via Long Steps
Authors:Benjamin Grimmer
Abstract: This work establishes provably faster convergence rates for gradient descent via a computer-assisted analysis technique. Our theory allows nonconstant stepsize policies with frequent long steps potentially violating descent by analyzing the overall effect of many iterations at once rather than the typical one-iteration inductions used in most first-order method analyses. We show that long steps, which may increase the objective value in the short term, lead to provably faster convergence in the long term. A conjecture towards proving a faster $O(1/T\log T)$ rate for gradient descent is also motivated along with simple numerical validation.
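The effect of occasional long steps can be seen on a small example. The sketch below compares constant stepsize $1/L$ with a cyclic pattern that ends in a long step, on a convex quadratic with smoothness constant $L=1$. The pattern `(1.5, 1.5, 3.0)` is a hypothetical choice made here for illustration; it is not one of the stepsize patterns certified in the paper, and the quadratic test function is likewise an assumption.

```python
import numpy as np

def gradient_descent(grad, x0, steps):
    """Run gradient descent with a prescribed sequence of stepsizes."""
    x = np.asarray(x0, dtype=float)
    for h in steps:
        x = x - h * grad(x)
    return x

# Convex quadratic f(x) = 0.5 * sum(d_i * x_i^2); smoothness constant L = max(d) = 1.
d = np.array([1.0, 0.1, 0.01])
f = lambda x: 0.5 * np.sum(d * x**2)
grad = lambda x: d * x

L = 1.0
T = 60
x0 = np.ones(3)

# Baseline: constant stepsize 1/L.
constant_steps = [1.0 / L] * T
# Hypothetical cyclic pattern: two moderate steps, then one long step of 3/L.
# The long step alone can increase f (it overshoots along the steepest coordinate).
pattern = [1.5 / L, 1.5 / L, 3.0 / L]
long_steps = pattern * (T // len(pattern))

x_const = gradient_descent(grad, x0, constant_steps)
x_long = gradient_descent(grad, x0, long_steps)
f_const, f_long = f(x_const), f(x_long)
```

After the same number of iterations, `f_long` is smaller than `f_const` on this instance, because the gain along the flat directions over a full cycle outweighs the temporary overshoot along the steep one. This mirrors the idea of analyzing several iterations at once rather than requiring descent at every step.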
1.Control and estimation of multi-commodity network flow under aggregation
Authors:Yongxin Chen, Tryphon T. Georgiou, Michele Pavon
Abstract: A paradigm put forth by E. Schr\"odinger in 1931/32, known as Schr\"odinger bridges, represents a formalism to pose and solve control and estimation problems seeking a perturbation from an initial control schedule (in the case of control), or from a prior probability law (in the case of estimation), sufficient to reconcile data in the form of marginal distributions and minimal in the sense of relative entropy to the prior. In the same spirit, we consider traffic-flow and apply a Schr\"odinger-type dictum, to perturb minimally with respect to a suitable relative entropy functional a prior schedule/law so as to reconcile the traffic flow with scarce aggregate distributions on families of indistinguishable individuals. Specifically, we consider the problem to regulate/estimate multi-commodity network flow rates based only on empirical distributions of commodities being transported (e.g., types of vehicles through a network, in motion) at two given times. Thus, building on Schr\"odinger's large deviation rationale, we develop a method to identify {\em the most likely flow rates (traffic flow)}, given prior information and aggregate observations. Our method further extends the Schr\"odinger bridge formalism to the multi-commodity setting, allowing commodities to exit or enter the flow field as well (e.g., vehicles to enter and stop and park) at any time. The behavior of entering or exiting the flow field, by commodities or vehicles, is modeled by a Markov chain with killing and creation states. Our method is illustrated with a numerical experiment.
2.Linearization via Ordering Variables in Binary Optimization for Ising Machines
Authors:Kentaro Ohno, Nozomu Togawa
Abstract: Ising machines are next-generation computers expected to efficiently sample near-optimal solutions of combinatorial optimization problems. Combinatorial optimization problems are modeled as quadratic unconstrained binary optimization (QUBO) problems to apply an Ising machine. However, current state-of-the-art Ising machines still often fail to output near-optimal solutions due to the complicated energy landscape of QUBO problems. Furthermore, physical implementation of Ising machines severely restricts the size of QUBO problems to be input as a result of limited hardware graph structures. In this study, we take a new approach to these challenges by injecting auxiliary penalties preserving the optimum, which reduces quadratic terms in QUBO objective functions. The process simultaneously simplifies the energy landscape of QUBO problems, allowing search for near-optimal solutions, and makes QUBO problems sparser, facilitating encoding into Ising machines with restriction on the hardware graph structure. We propose linearization via ordering variables of QUBO problems as an outcome of the approach. By applying the proposed method to synthetic QUBO instances and to multi-dimensional knapsack problems, we empirically validate the effects on enhancing minor embedding of QUBO problems and performance of Ising machines.
3.A regularized Interior Point Method for sparse Optimal Transport on Graphs
Authors:Stefano Cipolla, Jacek Gondzio, Filippo Zanetti
Abstract: In this work, the authors address the Optimal Transport (OT) problem on graphs using a proximal stabilized Interior Point Method (IPM). In particular, strongly leveraging the induced primal-dual regularization, the authors propose to solve large scale OT problems on sparse graphs using a bespoke IPM algorithm able to suitably exploit primal-dual regularization in order to enforce scalability. Indeed, the authors prove that the introduction of the regularization allows the use of sparsified versions of the normal Newton equations to inexpensively generate IPM search directions. A detailed theoretical analysis is carried out showing the polynomial convergence of the inner algorithm in the proposed computational framework. Moreover, the presented numerical results showcase the efficiency and robustness of the proposed approach when compared to network simplex solvers.
4.A stochastic two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems
Authors:Chenzheng Guo, Jing Zhao, Qiao-Li Dong
Abstract: In this paper, for solving a broad class of large-scale nonconvex and nonsmooth optimization problems, we propose a stochastic two-step inertial Bregman proximal alternating linearized minimization (STiBPALM) algorithm with variance-reduced stochastic gradient estimators. We also show that SAGA and SARAH are variance-reduced gradient estimators. Under expectation conditions with the Kurdyka-Lojasiewicz property and some suitable conditions on the parameters, we obtain that the sequence generated by the proposed algorithm converges to a critical point. A general convergence rate is also provided. Numerical experiments on sparse nonnegative matrix factorization and blind image-deblurring are presented to demonstrate the performance of the proposed algorithm.
5.Stochastic Nested Compositional Bi-level Optimization for Robust Feature Learning
Authors:Xuxing Chen, Krishnakumar Balasubramanian, Saeed Ghadimi
Abstract: We develop and analyze stochastic approximation algorithms for solving nested compositional bi-level optimization problems. These problems involve a nested composition of $T$ potentially non-convex smooth functions in the upper-level, and a smooth and strongly convex function in the lower-level. Our proposed algorithm does not rely on matrix inversions or mini-batches and can achieve an $\epsilon$-stationary solution with an oracle complexity of approximately $\tilde{O}_T(1/\epsilon^{2})$, assuming the availability of stochastic first-order oracles for the individual functions in the composition and the lower-level, which are unbiased and have bounded moments. Here, $\tilde{O}_T$ hides polylog factors and constants that depend on $T$. The key challenge we address in establishing this result relates to handling three distinct sources of bias in the stochastic gradients. The first source arises from the compositional nature of the upper-level, the second stems from the bi-level structure, and the third emerges due to the utilization of Neumann series approximations to avoid matrix inversion. To demonstrate the effectiveness of our approach, we apply it to the problem of robust feature learning for deep neural networks under covariate shift, showcasing the benefits and advantages of our methodology in that context.
6.Reliable optimal controls for SEIR models in epidemiology
Authors:Simone Cacace, Alessio Oliviero
Abstract: We present and compare two different optimal control approaches applied to SEIR models in epidemiology, which allow us to obtain some policies for controlling the spread of an epidemic. The first approach uses Dynamic Programming to characterise the value function of the problem as the solution of a partial differential equation, the Hamilton-Jacobi-Bellman equation, and derive the optimal policy in feedback form. The second is based on Pontryagin's maximum principle and directly gives open-loop controls, via the solution of an optimality system of ordinary differential equations. This method, however, may not converge to the optimal solution. We propose a combination of the two methods in order to obtain high-quality and reliable solutions. Several simulations are presented and discussed.
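For readers unfamiliar with the underlying dynamics, the sketch below simulates a plain, uncontrolled SEIR model with a classical fourth-order Runge-Kutta scheme. It is only an illustration of the state equations that an optimal control would act on. The parameter values and the names `seir_rhs` and `simulate_seir` are assumptions, not taken from the paper.

```python
import numpy as np

def seir_rhs(state, beta, sigma, gamma):
    """Right-hand side of the standard SEIR equations (no control)."""
    S, E, I, R = state
    N = S + E + I + R
    new_infections = beta * S * I / N
    return np.array([
        -new_infections,                # susceptible
        new_infections - sigma * E,     # exposed
        sigma * E - gamma * I,          # infectious
        gamma * I,                      # recovered
    ])

def simulate_seir(state0, beta, sigma, gamma, dt, steps):
    """Integrate with classical RK4; returns an array of shape (steps + 1, 4)."""
    traj = np.empty((steps + 1, 4))
    traj[0] = state0
    for k in range(steps):
        y = traj[k]
        k1 = seir_rhs(y, beta, sigma, gamma)
        k2 = seir_rhs(y + 0.5 * dt * k1, beta, sigma, gamma)
        k3 = seir_rhs(y + 0.5 * dt * k2, beta, sigma, gamma)
        k4 = seir_rhs(y + dt * k3, beta, sigma, gamma)
        traj[k + 1] = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

# Illustrative (assumed) parameters: transmission, incubation and recovery rates.
state0 = np.array([990.0, 0.0, 10.0, 0.0])
traj = simulate_seir(state0, beta=0.5, sigma=0.2, gamma=0.1, dt=0.1, steps=1000)
```

A control would typically enter such a model by modifying one of the rates, for example by reducing `beta`; how it enters in the paper is not specified in the abstract.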
7.Stability and genericity of bang-bang controls in affine problems
Authors:Alberto Domínguez Corella, Gerd Wachsmuth
Abstract: We analyse the role of the bang-bang property in affine optimal control problems. We show that many essential stability properties of affine problems are only satisfied when minimizers have the bang-bang property. Moreover, we prove that almost any perturbation in an affine optimal control problem leads to a bang-bang strict global minimizer. We work in an abstract framework that covers many problems in the optimal control literature, including problems constrained by partial and ordinary differential equations. We give examples that show the applicability of our results to specific optimal control problems.
1.Invex Programs: First Order Algorithms and Their Convergence
Authors:Adarsh Barik, Suvrit Sra, Jean Honorio
Abstract: Invex programs are a special kind of non-convex problems which attain global minima at every stationary point. While classical first-order gradient descent methods can solve them, they converge very slowly. In this paper, we propose new first-order algorithms to solve the general class of invex problems. We identify sufficient conditions for convergence of our algorithms and provide rates of convergence. Furthermore, we go beyond unconstrained problems and provide a novel projected gradient method for constrained invex programs with convergence rate guarantees. We compare and contrast our results with existing first-order algorithms for a variety of unconstrained and constrained invex problems. To the best of our knowledge, our proposed algorithm is the first algorithm to solve constrained invex programs.
2.Tropical convexity in location problems
Authors:Andrei Comăneci
Abstract: We investigate location problems whose optimum lies in the tropical convex hull of the input points. Firstly, we study geodesically star-convex sets under the asymmetric tropical distance and introduce the class of tropically quasiconvex functions whose sub-level sets have this shape. The latter are related to monotonic functions. Then we show that location problems whose distances are measured by tropically quasiconvex functions as before give an optimum in the tropical convex hull of the input points. We also show that a similar result holds if we replace the input points by tropically convex sets. Finally, we focus on applications to phylogenetics presenting properties of consensus methods arising from our class of location problems.
3.An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization
Authors:Guy Kornowski, Ohad Shamir
Abstract: We study the complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only noisy function evaluations. Recent works proposed several stochastic zero-order algorithms that solve this task, all of which suffer from a dimension-dependence of $\Omega(d^{3/2})$ where $d$ is the dimension of the problem, which was conjectured to be optimal. We refute this conjecture by providing a faster algorithm that has complexity $O(d\delta^{-1}\epsilon^{-3})$, which is optimal (up to numerical constants) with respect to $d$ and also optimal with respect to the accuracy parameters $\delta,\epsilon$, thus solving an open question due to Lin et al. (NeurIPS'22). Moreover, the convergence rate achieved by our algorithm is also optimal for smooth objectives, proving that in the nonconvex stochastic zero-order setting, nonsmooth optimization is as easy as smooth optimization. We provide algorithms that achieve the aforementioned convergence rate in expectation as well as with high probability. Our analysis is based on a simple yet powerful geometric lemma regarding the Goldstein-subdifferential set, which allows utilizing recent advancements in first-order nonsmooth nonconvex optimization.
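To make "using only noisy function evaluations" concrete, here is a sketch of a standard two-point randomized-smoothing gradient estimator from the zero-order literature. It is not claimed to be the algorithm of the paper above. The function name `two_point_gradient_estimate` and the test objective are assumptions. The estimator is unbiased for the gradient of a smoothed version of `f`; for a linear `f` it is unbiased for the gradient itself, which the example checks by averaging.

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta, rng):
    """Estimate a gradient from two function values along a random direction.

    u is uniform on the unit sphere; the factor d / (2 * delta) makes the
    estimate unbiased for the gradient of the delta-smoothed function.
    """
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

rng = np.random.default_rng(0)
a = np.array([1.0, -2.0, 0.5])

def f(x):
    # Linear test objective (assumed for illustration); its gradient is a.
    return float(a @ x)

x = np.zeros(3)
samples = [two_point_gradient_estimate(f, x, 0.1, rng) for _ in range(20000)]
mean_estimate = np.mean(samples, axis=0)
```

Each single estimate is noisy, with variance that grows with the dimension `d`; this is one way to see why the dimension-dependence of zero-order methods is a central question.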
1.Randomized subspace gradient method for constrained optimization
Authors:Ryota Nozawa, Pierre-Louis Poirion, Akiko Takeda
Abstract: We propose randomized subspace gradient methods for high-dimensional constrained optimization. While there have been similarly purposed studies on unconstrained optimization problems, there have been few on constrained optimization problems due to the difficulty of handling constraints. Our algorithms project gradient vectors onto a subspace that is a random projection of the subspace spanned by the gradients of active constraints. We determine the worst-case iteration complexity under linear and nonlinear settings and theoretically confirm that our algorithms can take a larger step size than their deterministic version. Owing to the advantages of longer steps and randomized subspace gradients, we show that our algorithms are especially efficient in terms of time complexity when gradients cannot be obtained easily. Numerical experiments show that they tend to find better solutions because of the randomness of their subspace selection. Furthermore, they perform well in cases where gradients cannot be obtained directly and are instead obtained using directional derivatives.
2.Scylla: a matrix-free fix-propagate-and-project heuristic for mixed-integer optimization
Authors:Gioni Mexi, Mathieu Besançon, Suresh Bolusani, Antonia Chmiela, Ambros Gleixner, Alexander Hoen
Abstract: We introduce Scylla, a primal heuristic for mixed-integer optimization problems. It exploits approximate solves of the Linear Programming relaxations through the matrix-free Primal-Dual Hybrid Gradient algorithm with specialized termination criteria, and derives integer-feasible solutions via fix-and-propagate procedures and feasibility-pump-like updates to the objective function. Computational experiments show that the method is particularly suited to instances with hard linear relaxations.
3.Finite Elements with Switch Detection for Numerical Optimal Control of Nonsmooth Dynamical Systems with Set-Valued Step Functions
Authors:Armin Nurkanović, Anton Pozharskiy, Jonathan Frey, Moritz Diehl
Abstract: This paper develops high-accuracy methods for numerically solving optimal control problems subject to nonsmooth differential equations with set-valued step functions. A notable subclass of these systems are Filippov systems. The set-valued step functions are here written as the solution map of a linear program. Using the optimality conditions of this problem we rewrite the initial nonsmooth system into an equivalent dynamic complementarity system (DCS). We extend the Finite Elements with Switch Detection (FESD) method [Nurkanovi\'c et al., 2022], initially developed for Filippov systems transformed via Stewart's reformulation into DCS [Stewart, 1990], to the class of nonsmooth systems with set-valued step functions. The key ideas are to start with a standard Runge-Kutta method for the obtained DCS and to let the integration step sizes be degrees of freedom. Next, we introduce additional conditions to enable implicit but exact switch detection and to remove possible spurious degrees of freedom if no switches occur. The theoretical properties of the method are studied. Its favorable properties are illustrated on numerical simulation and optimal control examples. All methods introduced in this paper are implemented in the open-source software package NOSNOC.
4.Absolute value linear programming
Authors:Milan Hladík, David Hartman
Abstract: We deal with linear programming problems involving absolute values in their formulations, so that they are no longer expressible as standard linear programs. The presence of absolute values causes the problems to be nonconvex and nonsmooth, and hence hard to solve. In this paper, we study fundamental properties of the topology and the geometric shape of the solution set, and also conditions for convexity, connectedness, boundedness and integrality of the vertices. Further, we address various complexity issues, showing that many basic questions are NP-hard to solve. We show that the feasible set is a (nonconvex) polyhedral set and, more importantly, every nonconvex polyhedral set can be described by means of absolute value constraints. We also provide a necessary and sufficient condition for a KKT point of a nonconvex quadratic programming reformulation to solve the original problem.
5.Parallel drone scheduling vehicle routing problems with collective drones
Authors:Roberto Montemanni, Mauro Dell'Amico, Andrea Corsini
Abstract: We study last-mile delivery problems where trucks and drones collaborate to deliver goods to final customers. In particular, we focus on problem settings where either a single truck or a fleet of several homogeneous trucks works in parallel with drones, and drones have the capability of collaborating on delivery missions. This cooperative behaviour of the drones, which are able to connect to each other and work together for some delivery tasks, enhances their potential, since connected drones have increased lifting capabilities and can fly at higher speed, overcoming the main limitations of the setting where the drones can only work independently. In this work, we contribute a Constraint Programming model and a valid inequality for the version of the problem with one truck, namely the \emph{Parallel Drone Scheduling Traveling Salesman Problem with Collective Drones}, and we introduce for the first time the variant with multiple trucks, called the \emph{Parallel Drone Scheduling Vehicle Routing Problem with Collective Drones}. For the latter variant, we propose two Constraint Programming models and a Mixed Integer Linear Programming model. An extensive experimental campaign leads to state-of-the-art results for the problem with one truck and some understanding of the presented models' behaviour on the version with multiple trucks. Some insights about future research are finally discussed.
6.The generalized Nash game proposed by Rosen
Authors:Carlos Calderón, John Cotrina
Abstract: We deal with the generalized Nash game proposed by Rosen, which is a game with strategy sets that are coupled across players through a shared constraint. A reduction to a classical game is shown, and as a consequence, Rosen's result can be deduced from the one given by Arrow and Debreu. We also establish necessary and sufficient conditions for a point to be a generalized Nash equilibrium employing the variational inequality approach. Finally, some existence results are given in the non-compact case under coerciveness conditions.
7.Time-dependent parameter identification in a Fokker-Planck equation based magnetization model of large ensembles of nanoparticles
Authors:Hannes Albers, Tobias Kluth
Abstract: In this article, we consider a model motivated by large ensembles of nanoparticles' magnetization dynamics using the Fokker-Planck equation and analyze the underlying parabolic PDE being defined on a smooth, compact manifold without boundary with respect to time-dependent parameter identification using regularization schemes. In the context of magnetic particle imaging, possible fields of application can be found including calibration procedures improved by time-dependent particle parameters and dynamic tracking of nanoparticle orientation. This results in reconstructing different parameters of interest, such as the applied magnetic field and the particles' easy axis. These problems are in particular addressed in the accompanying numerical study.
8.Absorbing games with irrational values
Authors:Miquel Oliu-Barton
Abstract: Can an absorbing game with rational data have an irrational limit value? Yes: In this note we provide the simplest examples where this phenomenon arises. That is, the following $3\times 3$ absorbing game \[ A = \begin{bmatrix} 1^* & 1^* & 2^* \\ 1^* & 2^* & 0\phantom{^*} \\ 2^* & 0\phantom{^*} & 1^* \end{bmatrix}, \] and a sequence of $2\times 2$ absorbing games whose limit values are $\sqrt{k}$, for all integer $k$. Finally, we conjecture that any algebraic number can be represented as the limit value of an absorbing game.
9.Accelerated Optimization Landscape of Linear-Quadratic Regulator
Authors:Lechen Feng, Yuan-Hua Ni
Abstract: The linear-quadratic regulator (LQR) is a landmark problem in the field of optimal control, and it is the concern of this paper. Generally, LQR is classified into state-feedback LQR (SLQR) and output-feedback LQR (OLQR) based on whether the full state is obtained. It has been suggested in existing literature that both the SLQR and the OLQR could be viewed as \textit{constrained nonconvex matrix optimization} problems in which the only variable to be optimized is the feedback gain matrix. In this paper, we introduce a first-order accelerated optimization framework for handling the LQR problem, and give its convergence analysis for the cases of SLQR and OLQR, respectively. Specifically, a Lipschitz Hessian property of the LQR performance criterion is presented, which turns out to be a crucial property for the application of modern optimization techniques. For the SLQR problem, a continuous-time hybrid dynamic system is introduced, whose solution trajectory is shown to converge exponentially to the optimal feedback gain with Nesterov-optimal order $1-\frac{1}{\sqrt{\kappa}}$ ($\kappa$ the condition number). Then, the symplectic Euler scheme is utilized to discretize the hybrid dynamic system, and a Nesterov-type method with a restarting rule is proposed that preserves the continuous-time convergence rate, i.e., the discretized algorithm admits the Nesterov-optimal convergence order. For the OLQR problem, a Hessian-free accelerated framework is proposed, which is a two-procedure method consisting of semiconvex function optimization and negative curvature exploitation. In a time $\mathcal{O}(\epsilon^{-7/4}\log(1/\epsilon))$, the method can find an $\epsilon$-stationary point of the performance criterion; this entails that the method improves upon the $\mathcal{O}(\epsilon^{-2})$ complexity of vanilla gradient descent. Moreover, our method provides a second-order guarantee for the stationary point.
10.A second order dynamical system method for solving a maximal comonotone inclusion problem
Authors:Zengzhen Tan, Rong Hu, Yaping Fang
Abstract: In this paper a second order dynamical system model is proposed for computing a zero of a maximal comonotone operator in Hilbert spaces. Under mild conditions, we prove existence and uniqueness of a strong global solution of the proposed dynamical system. A proper tuning of the parameters allows us to establish fast convergence properties of the trajectories generated by the dynamical system. The weak convergence of the trajectory to a zero of the maximal comonotone operator is also proved. Furthermore, a discrete version of the dynamical system is considered and convergence properties matching those of the dynamical system are established under the same framework. Finally, the validity of the proposed dynamical system and its discrete version is demonstrated by two numerical examples.
11.Optimal Solutions for a Class of Set-Valued Evolution Problems
Authors:Stefano Bianchini, Alberto Bressan, Maria Teresa Chiri
Abstract: The paper is concerned with a class of optimization problems for moving sets $t\mapsto\Omega(t)\subset\mathbb{R}^2$, motivated by the control of invasive biological populations. Assuming that the initial contaminated set $\Omega_0$ is convex, we prove that a strategy is optimal if and only if at each given time $t\in [0,T]$ the control is active along the portion of the boundary $\partial \Omega(t)$ where the curvature is maximal. In particular, this implies that $\Omega(t)$ is convex for all $t\geq 0$. The proof relies on the analysis of a one-step constrained optimization problem, obtained by a time discretization.
12.Cascading Failures in the Global Financial System: A Dynamical Model
Authors:Leonardo Stella, Dario Bauso, Franco Blanchini, Patrizio Colaneri
Abstract: In this paper, we propose a dynamical model to capture cascading failures among interconnected organizations in the global financial system. Failures can take the form of bankruptcies, defaults, and other insolvencies. The network that underpins the financial interdependencies between different organizations constitutes the backbone of the financial system. A failure in one or more of these organizations can lead to the propagation of the financial collapse onto other organizations in a domino effect. Paramount importance is therefore given to the mitigation of these failures. Motivated by the relevance of this problem and recent prominent events connected to it, we develop a framework that allows us to investigate under what conditions organizations remain healthy or are involved in the propagation of the failures in the network. The contributions of this paper are the following: i) we develop a dynamical model that describes the equity values of financial organizations and their evolution over time given an initial condition; ii) we characterize the equilibria for this model by proving the existence and uniqueness of these equilibria, and by providing an explicit expression for them; and iii) we provide a computational method via sign-space iteration to analyze the propagation of failures and the attractive equilibrium point.
13.Tikhonov regularized second-order plus first-order primal-dual dynamical systems with asymptotically vanishing damping for linear equality constrained convex optimization problems
Authors:Ting Ting Zhu, Rong Hu, Ya Ping Fang
Abstract: In this paper, in the setting of Hilbert spaces, we consider a Tikhonov regularized second-order plus first-order primal-dual dynamical system with asymptotically vanishing damping for a linear equality constrained convex optimization problem. The convergence properties of the proposed dynamical system depend heavily upon the choice of the Tikhonov regularization parameter. When the Tikhonov regularization parameter decreases rapidly to zero, we establish the fast convergence rates of the primal-dual gap, the objective function error, the feasibility measure, and the gradient norm of the objective function along the trajectory generated by the system. When the Tikhonov regularization parameter tends slowly to zero, we prove that the primal trajectory of the Tikhonov regularized dynamical system converges strongly to the minimal norm solution of the linear equality constrained convex optimization problem. Numerical experiments are performed to illustrate the efficiency of our approach.
14.On the Geometry and Refined Rate of Primal-Dual Hybrid Gradient for Linear Programming
Authors:Haihao Lu, Jinwen Yang
Abstract: We study the convergence behaviors of primal-dual hybrid gradient (PDHG) for solving linear programming (LP). PDHG is the base algorithm of a new general-purpose first-order method LP solver, PDLP, which aims to scale up LP by taking advantage of modern computing architectures. Despite its numerical success, the theoretical understanding of PDHG for LP is still very limited; the previous complexity result relies on the global Hoffman constant of the KKT system, which is known to be very loose and uninformative. In this work, we aim to develop a fundamental understanding of the convergence behaviors of PDHG for LP and to develop a refined complexity rate that does not rely on the global Hoffman constant. We show that there are two major stages of PDHG for LP: in Stage I, PDHG identifies active variables and the length of the first stage is driven by a certain quantity which measures how close the non-degeneracy part of the LP instance is to degeneracy; in Stage II, PDHG effectively solves a homogeneous linear inequality system, and the complexity of the second stage is driven by a well-behaved local sharpness constant of the system. This finding is closely related to the concept of partial smoothness in non-smooth optimization, and it is the first complexity result of finite time identification without the non-degeneracy assumption. An interesting implication of our results is that degeneracy itself does not slow down the convergence of PDHG for LP, but near-degeneracy does.
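As a rough illustration of the base iteration discussed above, the sketch below applies a plain PDHG update to a tiny standard-form LP, min c^T x subject to Ax = b, x >= 0. It is a bare-bones version under stated assumptions: equal primal and dual step sizes chosen from the spectral norm of `A`, no restarts, no preconditioning, and none of the enhancements used in PDLP. The function name `pdhg_lp` and the example data are hypothetical.

```python
import numpy as np

def pdhg_lp(c, A, b, num_iters=2000):
    """Plain PDHG for  min c^T x  s.t.  A x = b,  x >= 0.

    Uses equal primal and dual step sizes with step**2 * ||A||_2**2 < 1.
    Returns the final primal iterate x and dual iterate y.
    """
    m, n = A.shape
    x = np.zeros(n)
    y = np.zeros(m)
    step = 0.9 / np.linalg.norm(A, 2)
    for _ in range(num_iters):
        # Primal step: projected gradient step on the Lagrangian in x.
        x_new = np.maximum(x - step * (c - A.T @ y), 0.0)
        # Dual step: ascent step using the extrapolated point 2*x_new - x.
        y = y + step * (b - A @ (2.0 * x_new - x))
        x = x_new
    return x, y

# Tiny example: min x1 + 2*x2  s.t.  x1 + x2 = 1,  x >= 0.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x, y = pdhg_lp(c, A, b)
```

On this example the second variable is pinned at zero by the projection throughout the run, so the remaining iteration is linear in the first variable and the multiplier and contracts geometrically. This loosely echoes the two-stage picture in the abstract, although the example is far too small to say anything about the general theory.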
15.Bilateral boundary control of an input delayed 2-D reaction-diffusion equation
Authors:Dandan Guan, Yanmei Chen, Jie Qi, Linglong Du
Abstract: In this paper, a delay compensation design method based on PDE backstepping is developed for a two-dimensional reaction-diffusion partial differential equation (PDE) with bilateral input delays. The PDE is defined in a rectangular domain, and the bilateral control is imposed on a pair of opposite sides of the rectangle. To represent the delayed bilateral inputs, we introduce two 2-D transport PDEs that form a cascade system with the original PDE. A novel set of backstepping transformations is proposed for delay compensator design, including one Volterra integral transformation and two affine Volterra integral transformations. Unlike the kernel equation for 1-D PDE systems with delayed boundary input, the resulting kernel equations for the 2-D system have singular initial conditions governed by the Dirac Delta function. Consequently, the kernel solutions are written as a double trigonometric series with singularities. To address the challenge of stability analysis posed by the singularities, we prove a set of inequalities by using the Cauchy-Schwarz inequality, the 2-D Fourier series, and the Parseval's theorem. A numerical simulation illustrates the effectiveness of the proposed delay-compensation method.
16.Symmetry reduction and recovery of trajectories of optimal control problems via measure relaxations
Authors:Nicolas Augier, Didier Henrion, Milan Korda, Victor Magron
Abstract: We address the problem of symmetry reduction of optimal control problems under the action of a finite group from a measure relaxation viewpoint. We propose a method based on the moment-SOS aka Lasserre hierarchy which allows one to significantly reduce the computation time and memory requirements compared to the case without symmetry reduction. We show that the recovery of optimal trajectories boils down to solving a symmetric parametric polynomial system. Then we illustrate our method on the symmetric integrator and the time-optimal inversion of qubits.
1.A generalized Routh-Hurwitz criterion for the stability analysis of polynomials with complex coefficients: application to the PI-control of vibrating structures
Authors:Anthony Hastir, Riccardo Muolo
Abstract: The classical Routh-Hurwitz criterion is one of the most popular methods to study the stability of polynomials with real coefficients, given its simplicity and ductility. However, when moving to polynomials with complex coefficients, a generalization exists but it is rather cumbersome and not as easy to apply. In this paper, we make such generalization clear and understandable for a wider public and develop an algorithm to apply it. After having explained the method, we demonstrate its use to determine the external stability of a system consisting of the interconnection between a rotating shaft and a PI-regulator. The extended Routh-Hurwitz criterion gives then necessary and sufficient conditions on the gains of the PI-regulator to achieve stabilization of the system together with regulation of the output. This illustrative example makes our formulation of the extended Routh-Hurwitz criterion ready to be used in several other applications.
2.Benign landscapes of low-dimensional relaxations for orthogonal synchronization on general graphs
Authors:Andrew D. McRae, Nicolas Boumal
Abstract: Orthogonal group synchronization is the problem of estimating $n$ elements $Z_1, \ldots, Z_n$ from the orthogonal group $\mathrm{O}(r)$ given some relative measurements $R_{ij} \approx Z_i^{}Z_j^{-1}$. The least-squares formulation is nonconvex. To avoid its local minima, a Shor-type convex relaxation squares the dimension of the optimization problem from $O(n)$ to $O(n^2)$. Burer--Monteiro-type nonconvex relaxations have generic landscape guarantees at dimension $O(n^{3/2})$. For smaller relaxations, the problem structure matters. It has been observed in the robotics literature that nonconvex relaxations of only slightly increased dimension seem sufficient for SLAM problems. We partially explain this. This also has implications for Kuramoto oscillators. Specifically, we minimize the least-squares cost function in terms of estimators $Y_1, \ldots, Y_n$. Each $Y_i$ is relaxed to the Stiefel manifold $\mathrm{St}(r, p)$ of $r \times p$ matrices with orthonormal rows. The available measurements implicitly define a (connected) graph $G$ on $n$ vertices. In the noiseless case, we show that second-order critical points are globally optimal as soon as $p \geq r+2$ for all connected graphs $G$. (This implies that Kuramoto oscillators on $\mathrm{St}(r, p)$ synchronize for all $p \geq r + 2$.) This result is the best possible for general graphs; the previous best known result requires $2p \geq 3(r + 1)$. For $p > r + 2$, our result is robust to modest amounts of noise (depending on $p$ and $G$). When local minima remain, they still achieve minimax-optimal error rates. Our proof uses a novel randomized choice of tangent direction to prove (near-)optimality of second-order critical points. Finally, we partially extend our noiseless landscape results to the complex case (unitary group), showing that there are no spurious local minima when $2p \geq 3r$.
3.Stochastic Approximation for Expectation Objective and Expectation Inequality-Constrained Nonconvex Optimization
Authors:Francisco Facchinei, Vyacheslav Kungurtsev
Abstract: Stochastic Approximation has been a prominent set of tools for solving problems with noise and uncertainty. Increasingly, it becomes important to solve optimization problems wherein there is noise in both a set of constraints that a practitioner requires the system to adhere to, as well as the objective, which typically involves some empirical loss. We present the first stochastic approximation approach for solving this class of problems using the Ghost framework of incorporating penalty functions for analysis of a sequential convex programming approach together with a Monte Carlo estimator of nonlinear maps. We provide almost sure convergence guarantees and demonstrate the performance of the procedure on some representative examples.
4.Constraint Programming models for the parallel drone scheduling vehicle routing problem
Authors:Roberto Montemanni, Mauro Dell'Amico
Abstract: Drones are currently seen as a viable way of improving the distribution of parcels in urban and rural environments, while working in coordination with traditional vehicles like trucks. In this paper we consider the parallel drone scheduling vehicle routing problem, where the service of a set of customers requiring a delivery is split between a fleet of trucks and a fleet of drones. We consider two variations of the problem. In the first one the problem is more theoretical, and the target is the minimization of the time required to complete the service and have all the vehicles back at the depot. In the second variant more realistic constraints involving operating costs, capacity limitation and workload balance are considered, and the target is to minimize the total operational costs. We propose several constraint programming models to deal with the two problems. An experimental campaign on the instances previously adopted in the literature is presented to validate the new solving methods. The results show that on top of being a viable way to solve problems to optimality, the models can also be used to derive effective heuristic solutions and high-quality lower bounds for the optimal cost, if the execution is interrupted before its natural end.
5.Convergence rate of entropy-regularized multi-marginal optimal transport costs
Authors:Luca Nenna, Paul Pegon
Abstract: We investigate the convergence rate of multi-marginal optimal transport costs that are regularized with the Boltzmann-Shannon entropy, as the noise parameter $\varepsilon$ tends to $0$. We establish lower and upper bounds on the difference with the unregularized cost of the form $C\varepsilon\log(1/\varepsilon)+O(\varepsilon)$ for some explicit dimensional constants $C$ depending on the marginals and on the ground cost, but not on the optimal transport plans themselves. Upper bounds are obtained for Lipschitz costs or locally semi-concave costs for a finer estimate, and lower bounds for $\mathcal{C}^2$ costs satisfying some signature condition on the mixed second derivatives that may include degenerate costs, thus generalizing results previously in the two marginals case and for non-degenerate costs. We obtain in particular matching bounds in some typical situations where the optimal plan is deterministic.
6.Exploratory mean-variance portfolio selection with Choquet regularizers
Authors:Junyi Guo, Xia Han, Hao Wang
Abstract: In this paper, we study a continuous-time exploratory mean-variance (EMV) problem under the framework of reinforcement learning (RL), and the Choquet regularizers are used to measure the level of exploration. By applying the classical Bellman principle of optimality, the Hamilton-Jacobi-Bellman equation of the EMV problem is derived and solved explicitly via maximizing statically a mean-variance constrained Choquet regularizer. In particular, the optimal distributions form a location-scale family, whose shape depends on the choices of the Choquet regularizer. We further reformulate the continuous-time Choquet-regularized EMV problem using a variant of the Choquet regularizer. Several examples are given under specific Choquet regularizers that generate broadly used exploratory samplers such as exponential, uniform and Gaussian. Finally, we design an RL algorithm to simulate and compare results under the two different forms of regularizers.
7.Distributed Interior Point Methods for Optimization in Energy Networks
Authors:Alexander Engelmann, Michael Kaupmann, Timm Faulwasser
Abstract: This note discusses an essentially decentralized interior point method, which is well suited for optimization problems arising in energy networks. Advantages of the proposed method are guaranteed and fast local convergence also for problems with non-convex constraints. Moreover, our method exhibits a small communication footprint and it achieves a comparably high solution accuracy with a limited number of iterations, whereby the local subproblems are of low computational complexity. We illustrate the performance of the proposed method on a problem from energy systems, i.e., we consider an optimal power flow problem with 708 buses.
8.Convergence Properties of Newton's Method for Globally Optimal Free Flight Trajectory Optimization
Authors:Ralf Borndörfer, Fabian Danecker, Martin Weiser
Abstract: The algorithmic efficiency of Newton-based methods for Free Flight Trajectory Optimization is heavily influenced by the size of the domain of convergence. We provide numerical evidence that the convergence radius is much larger in practice than what the theoretical worst case bounds suggest. The algorithm can be further improved by a convergence-enhancing domain decomposition.
9.Multiplicative Updates for Online Convex Optimization over Symmetric Cones
Authors:Ilayda Canyakmaz, Wayne Lin, Georgios Piliouras, Antonios Varvitsiotis
Abstract: We study online convex optimization where the possible actions are trace-one elements in a symmetric cone, generalizing the extensively-studied experts setup and its quantum counterpart. Symmetric cones provide a unifying framework for some of the most important optimization models, including linear, second-order cone, and semidefinite optimization. Using tools from the field of Euclidean Jordan Algebras, we introduce the Symmetric-Cone Multiplicative Weights Update (SCMWU), a projection-free algorithm for online optimization over the trace-one slice of an arbitrary symmetric cone. We show that SCMWU is equivalent to Follow-the-Regularized-Leader and Online Mirror Descent with symmetric-cone negative entropy as regularizer. Using this structural result we show that SCMWU is a no-regret algorithm, and verify our theoretical results with extensive experiments. Our results unify and generalize the analysis for the Multiplicative Weights Update method over the probability simplex and the Matrix Multiplicative Weights Update method over the set of density matrices.
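For orientation, the classical Multiplicative Weights Update over the probability simplex (the special case that the abstract says SCMWU generalizes) can be sketched as follows. This is not the authors' SCMWU; the function name, the step size `eta` and the two-expert toy losses are assumptions made purely for illustration.

```python
import math

def multiplicative_weights(loss_rounds, eta=0.5):
    """Classical MWU on the probability simplex (experts setting).

    loss_rounds: list of per-round loss vectors with entries in [0, 1].
    Returns the distributions played and the cumulative expected loss.
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    played, total_loss = [], 0.0
    for losses in loss_rounds:
        z = sum(weights)
        p = [w / z for w in weights]          # normalize onto the simplex
        played.append(p)
        total_loss += sum(pi * li for pi, li in zip(p, losses))
        # Multiplicative update: penalize each expert by exp(-eta * loss).
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return played, total_loss

# Hypothetical toy data: expert 0 never incurs loss, expert 1 always does.
rounds = [[0.0, 1.0]] * 50
played, total = multiplicative_weights(rounds)
final = played[-1]
```

In this toy run the played distribution concentrates on expert 0, so the cumulative expected loss stays bounded while the best expert's loss is zero, which is the no-regret behavior in its simplest form.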
10.Extreme occupation measures in Markov decision processes with a cemetery
Authors:Alexey Piunovskiy, Yi Zhang
Abstract: In this paper, we consider a Markov decision process (MDP) with a Borel state space $\textbf{X}\cup\{\Delta\}$, where $\Delta$ is an absorbing state (cemetery), and a Borel action space $\textbf{A}$. We consider the space of finite occupation measures restricted on $\textbf{X}\times \textbf{A}$, and the extreme points in it. It is possible that some strategies have infinite occupation measures. Nevertheless, we prove that every finite extreme occupation measure is generated by a deterministic stationary strategy. Then, for this MDP, we consider a constrained problem with total undiscounted criteria and $J$ constraints, where the cost functions are nonnegative. By assumption, the strategies inducing infinite occupation measures are not optimal. Then, our second main result is that, under mild conditions, the solution to this constrained MDP is given by a mixture of no more than $J+1$ occupation measures generated by deterministic stationary strategies.
11.Convergence of the momentum method for semi-algebraic functions with locally Lipschitz gradients
Authors:Cédric Josz, Lexiao Lai, Xiaopeng Li
Abstract: We propose a new length formula that governs the iterates of the momentum method when minimizing differentiable semi-algebraic functions with locally Lipschitz gradients. It enables us to establish local convergence, global convergence, and convergence to local minimizers without assuming global Lipschitz continuity of the gradient, coercivity, and a global growth condition, as is done in the literature. As a result, we provide the first convergence guarantee of the momentum method starting from arbitrary initial points when applied to principal component analysis, matrix sensing, and linear neural networks.
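As a rough illustration of the setting, the sketch below runs the heavy-ball form of the momentum method on f(x, y) = (xy - 1)^2, a one-dimensional factorization-type objective whose gradient is locally but not globally Lipschitz. The objective, step size, momentum parameter and starting point are assumptions chosen for the example and are not taken from the paper.

```python
def f(x, y):
    # Toy objective: fit the product x*y to 1.
    return (x * y - 1.0) ** 2

def grad_f(x, y):
    r = 2.0 * (x * y - 1.0)
    return r * y, r * x

def heavy_ball(x0, y0, step=0.05, beta=0.5, iters=500):
    """Momentum iteration: z_{k+1} = z_k - step * grad f(z_k) + beta * (z_k - z_{k-1})."""
    x_prev, y_prev = x0, y0
    x, y = x0, y0
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x_next = x - step * gx + beta * (x - x_prev)
        y_next = y - step * gy + beta * (y - y_prev)
        x_prev, y_prev, x, y = x, y, x_next, y_next
    return x, y

x_final, y_final = heavy_ball(1.5, 0.5)
```

With these illustrative parameters the iterates settle on a point of the curve xy = 1, that is, on a global minimizer of the toy objective.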
1.A Mini-Batch Quasi-Newton Proximal Method for Constrained Total-Variation Nonlinear Image Reconstruction
Authors:Tao Hong, Thanh-an Pham, Irad Yavneh, Michael Unser
Abstract: Over the years, computational imaging with accurate nonlinear physical models has drawn considerable interest due to its ability to achieve high-quality reconstructions. However, such nonlinear models are computationally demanding. A popular choice for solving the corresponding inverse problems is accelerated stochastic proximal methods (ASPMs), with the caveat that each iteration is expensive. To overcome this issue, we propose a mini-batch quasi-Newton proximal method (BQNPM) tailored to image-reconstruction problems with total-variation regularization. It involves an efficient approach that computes a weighted proximal mapping at a cost similar to that of the proximal mapping in ASPMs. However, BQNPM requires fewer iterations than ASPMs to converge. We assess the performance of BQNPM on three-dimensional inverse-scattering problems with linear and nonlinear physical models. Our results on simulated and real data show the effectiveness and efficiency of BQNPM.
2.Mixed Leader-Follower Dynamics
Authors:Hsin-Lun Li
Abstract: The original Leader-Follower (LF) model partitions all agents whose opinion is a number in $[-1,1]$ to a follower group, a leader group with a positive target opinion in $[0,1]$ and a leader group with a negative target opinion in $[-1,0]$. A leader group agent has a constant degree to its target and mixes it with the average opinion of its group neighbors at each update. A follower has a constant degree to the average opinion of the opinion neighbors of each leader group and mixes it with the average opinion of its group neighbors at each update. In this paper, we consider a variant of the LF model, namely the mixed model, in which the degrees can vary over time, the opinions can be high dimensional, and the number of leader groups can be more than two. We investigate circumstances under which all agents achieve a consensus. In particular, a few leaders can dominate the whole population.
3.Ill-posed linear inverse problems with box constraints: A new convex optimization approach
Authors:Henryk Gzyl
Abstract: Consider the linear equation $\mathbf{A}\mathbf{x}=\mathbf{y}$, where $\mathbf{A}$ is an $M\times N$-matrix, $\mathbf{x}\in\mathcal{K}\subset \mathbb{R}^N$ and $\mathbf{y}\in\mathbb{R}^M$ is a given vector. When $\mathcal{K}$ is a convex set and $M\not= N$ this is a typical ill-posed, linear inverse problem with convex constraints. Here we propose a new way to solve this problem when $\mathcal{K} = \prod_j[a_j,b_j]$. It consists of regarding $\mathbf{A}\mathbf{x}=\mathbf{y}$ as the constraint of a convex minimization problem, in which the objective (cost) function is the dual of a moment generating function. This leads to a nice minimization problem and some interesting comparison results. More importantly, the method provides a solution that lies in the interior of the constraint set $\mathcal{K}$. We also analyze the dependence of the solution on the data and relate it to the Le Chatelier principle.
4.From NeurODEs to AutoencODEs: a mean-field control framework for width-varying Neural Networks
Authors:Cristina Cipriani, Massimo Fornasier, Alessandro Scagliotti
Abstract: In our work, we build upon the established connection between Residual Neural Networks (ResNets) and continuous-time control systems known as NeurODEs. By construction, NeurODEs have been limited to constant-width layers, making them unsuitable for modeling deep learning architectures with width-varying layers. In this paper, we propose a continuous-time Autoencoder, which we call AutoencODE, and we extend to this case the mean-field control framework already developed for usual NeurODEs. In this setting, we tackle the case of low Tikhonov regularization, resulting in potentially non-convex cost landscapes. While the global results obtained for high Tikhonov regularization may not hold globally, we show that many of them can be recovered in regions where the loss function is locally convex. Inspired by our theoretical findings, we develop a training method tailored to this specific type of Autoencoders with residual connections, and we validate our approach through numerical experiments conducted on various examples.
5.Extended team orienteering problem: Algorithms and applications
Authors:Wen Ji, Ke Han, Qian Ge
Abstract: The team orienteering problem (TOP) determines a set of routes, each within a time or distance budget, which collectively visit a set of points of interest (POIs) such that the total score collected at the visited points is maximized. This paper proposes an extension of the TOP (ETOP) by allowing the POIs to be visited multiple times to accumulate scores. Such an extension is necessary for application scenarios like urban sensing where each POI needs to be continuously monitored, or disaster relief where certain locations need to be repeatedly covered. We present two approaches to solve the ETOP, one based on the adaptive large neighborhood search (ALNS) algorithm and the other being a bi-level matheuristic method. Sensitivity analyses are performed to fine-tune the algorithm parameters. Test results on complete graphs with different problem sizes show that: (1) both algorithms significantly outperform a greedy heuristic, with improvements ranging from 9.43% to 27.68%; and (2) while the ALNS-based algorithm slightly outperforms the matheuristic in terms of solution optimality, the latter is far more computationally efficient, being 11 to 385 times faster. Finally, a real-world case study of VOCs sensing is presented and formulated as an ETOP on a road network (incomplete graph), where the ALNS is outperformed by the matheuristic in terms of optimality, as the destroy and repair operators yield limited perturbation of existing solutions when constrained by a road network.
6.QUBO.jl: A Julia Ecosystem for Quadratic Unconstrained Binary Optimization
Authors:Pedro Maciel Xavier, Pedro Ripper, Tiago Andrade, Joaquim Dias Garcia, Nelson Maculan, David E. Bernal Neira
Abstract: We present QUBO.jl, an end-to-end Julia package for working with QUBO (Quadratic Unconstrained Binary Optimization) instances. This tool aims to convert a broad range of JuMP problems for straightforward application in many physics and physics-inspired solution methods whose standard optimization form is equivalent to the QUBO. These methods include quantum annealing, quantum gate-circuit optimization algorithms (Quantum Optimization Alternating Ansatz, Variational Quantum Eigensolver), other hardware-accelerated platforms, such as Coherent Ising Machines and Simulated Bifurcation Machines, and more traditional methods such as simulated annealing. Besides working with reformulations, QUBO.jl allows its users to interface with the aforementioned hardware, sending QUBO models in various file formats and retrieving results for subsequent analysis. QUBO.jl was written as a JuMP / MathOptInterface (MOI) layer that automatically maps between the input and output frames, thus providing a smooth modeling experience.
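To make the QUBO form concrete, here is a minimal sketch in plain Python (not Julia, and not the QUBO.jl or JuMP API) that encodes MaxCut on a small graph as minimizing x^T Q x over binary x and solves it by exhaustive search. The helper names and the three-node path graph are hypothetical and serve only as an illustration.

```python
from itertools import product

def maxcut_qubo(n, edges):
    """Upper-triangular Q with x^T Q x = -(cut size) for binary x.

    Each edge (i, j) contributes -x_i - x_j + 2*x_i*x_j = -(x_i - x_j)^2.
    """
    Q = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        i, j = min(i, j), max(i, j)
        Q[i][i] -= 1.0
        Q[j][j] -= 1.0
        Q[i][j] += 2.0
    return Q

def energy(Q, x):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))

def brute_force(Q):
    # Exhaustive search over all 2^n assignments; only viable for tiny n.
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: energy(Q, x))

# Hypothetical instance: path graph 0 - 1 - 2, whose maximum cut has size 2.
Q = maxcut_qubo(3, [(0, 1), (1, 2)])
best = brute_force(Q)
best_energy = energy(Q, best)
```

The annealers and physics-inspired samplers named in the abstract would replace the brute-force step; the matrix Q is the part that such a reformulation layer has to produce.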
7.AI4OPT: AI Institute for Advances in Optimization
Authors:Pascal Van Hentenryck, Kevin Dalmeijer
Abstract: This article is a short introduction to AI4OPT, the NSF AI Institute for Advances in Optimization. AI4OPT fuses AI and Optimization, inspired by end-use cases in supply chains, energy systems, chip design and manufacturing, and sustainable food systems. AI4OPT also applies its "teaching the teachers" philosophy to provide longitudinal educational pathways in AI for engineering.
1.Accelerated stochastic approximation with state-dependent noise
Authors:Sasila Ilandarideva, Anatoli Juditsky, Guanghui Lan, Tianjiao Li
Abstract: We consider a class of stochastic smooth convex optimization problems under rather general assumptions on the noise in the stochastic gradient observation. As opposed to the classical problem setting in which the variance of noise is assumed to be uniformly bounded, herein we assume that the variance of stochastic gradients is related to the "sub-optimality" of the approximate solutions delivered by the algorithm. Such problems naturally arise in a variety of applications, in particular, in the well-known generalized linear regression problem in statistics. However, to the best of our knowledge, none of the existing stochastic approximation algorithms for solving this class of problems attain optimality in terms of the dependence on accuracy, problem parameters, and mini-batch size. We discuss two non-Euclidean accelerated stochastic approximation routines--stochastic accelerated gradient descent (SAGD) and stochastic gradient extrapolation (SGE)--which carry a particular duality relationship. We show that both SAGD and SGE, under appropriate conditions, achieve the optimal convergence rate, attaining the optimal iteration and sample complexities simultaneously. However, corresponding assumptions for the SGE algorithm are more general; they allow, for instance, for efficient application of the SGE to statistical estimation problems under heavy tail noises and discontinuous score functions. We also discuss the application of the SGE to problems satisfying quadratic growth conditions, and show how it can be used to recover sparse solutions. Finally, we report on some simulation experiments to illustrate numerical performance of our proposed algorithms in high-dimensional settings.
2.Exponential stability of Euler-Bernoulli beam under boundary controls in rotation and angular velocity
Authors:Alemdar Hasanov
Abstract: This paper addresses the analysis of a boundary feedback system involving a non-homogeneous Euler-Bernoulli beam governed by the equation $m(x)u_{tt}+\mu(x)u_{t}+\left(r(x)u_{xx}\right)_{xx}=0$, subject to the initial conditions $u(x,0)=u_0(x)$, $u_t(x,0)=v_0(x)$ and boundary conditions $u(0,t)=0$, $\left (-r(x)u_{xx}(x,t)\right )_{x=0}=-k^{-}_r u_{x}(0,t)-k^{-}_a u_{xt}(0,t)$, $u(\ell,t)=0$, $\left (-r(x)u_{xx}(x,t)\right )_{x=\ell}=-k^{+}_r u_{x}(\ell,t)-k^{+}_a u_{xt}(\ell,t)$, with boundary control at both ends resulting from the rotation and angular velocity. The approach proposed in this study relies on the utilization of regular weak solutions, energy identity, and a physically motivated Lyapunov function. By imposing natural assumptions concerning physical parameters and other inputs, which ensure the existence of a regular weak solution, we successfully derive a uniform exponential decay estimate for the system's energy. The decay rate constant featured in this estimate is solely dependent on the physical and geometric properties of the beam. These properties encompass crucial parameters such as the viscous external damping coefficient $\mu(x)$, as well as the boundary springs $k^{-}_r,k^+_r $ and dampers $k^{-}_a,k^+_a$. To illustrate the practical effectiveness of our theoretical findings, numerical examples are provided. These examples serve to demonstrate the applicability and relevance of our derived results in real-world scenarios.
3.Strong stability of convexity with respect to the perimeter
Authors:Alessio Figalli, Yi Ru-Ya Zhang
Abstract: Let $E\subset \mathbb R^n$, $n\ge 2$, be a set of finite perimeter with $|E|=|B|$, where $B$ denotes the unit ball. When $n=2$, since convexification decreases perimeter (in the class of open connected sets), it is easy to prove the existence of a convex set $F$, with $|E|=|F|$, such that $$ P(E) - P(F) \ge c\,|E\Delta F|, \qquad c>0. $$ Here we prove that, when $n\ge 3$, there exists a convex set $F$, with $|E|=|F|$, such that $$ P(E) - P(F) \ge c(n) \,f\big(|E\Delta F|\big), \qquad c(n)>0,\qquad f(t)=\frac{t}{|\log t|} \text{ for }t \ll 1. $$ Moreover, one can choose $F$ to be a small $C^2$-deformation of the unit ball. Furthermore, this estimate is essentially sharp as we can show that the inequality above fails for $f(t)=t.$ Interestingly, the proof of our result relies on a new stability estimate for Alexandrov's Theorem on constant mean curvature sets.
4.Decentralized optimization with affine constraints over time-varying networks
Authors:Demyan Yarmoshik, Alexander Rogozin, Alexander Gasnikov
Abstract: The decentralized optimization paradigm assumes that each term of a finite-sum objective is privately stored by the corresponding agent. Agents are only allowed to communicate with their neighbors in the communication graph. We consider the case when the agents additionally have local affine constraints and the communication graph can change over time. We provide the first linearly convergent decentralized algorithm for time-varying networks by generalizing the optimal decentralized algorithm ADOM to the case of affine constraints. We show that its rate of convergence is optimal for first-order methods by providing the lower bounds for the number of communications and oracle calls.
5.Wasserstein medians: robustness, PDE characterization and numerics
Authors:Guillaume Carlier, Enis Chenchene, Katharina Eichinger
Abstract: We investigate the notion of Wasserstein median as an alternative to the Wasserstein barycenter, which has become popular but may be sensitive to outliers. In terms of robustness to corrupted data, we indeed show that Wasserstein medians have a breakdown point of approximately $\frac{1}{2}$. We give explicit constructions of Wasserstein medians in dimension one which enable us to obtain $L^p$ estimates (which do not hold in higher dimensions). We also address dual and multimarginal reformulations. In convex subsets of $\mathbb{R}^d$, we connect Wasserstein medians to a minimal (multi) flow problem à la Beckmann and a system of PDEs of Monge-Kantorovich-type, for which we propose a $p$-Laplacian approximation. Our analysis eventually leads to a new numerical method to compute Wasserstein medians, which is based on a Douglas-Rachford scheme applied to the minimal flow formulation of the problem.
6.Assessing the impact of Higher Order Network Structure on Tightness of OPF Relaxation
Authors:Nafis Sadik, Mohammad Rasoul Narimani
Abstract: AC optimal power flow (AC OPF) is a fundamental problem in power system operation and control. Accurately modeling the network physics via the AC power flow equations makes AC OPF a challenging nonconvex problem that results in significant computational challenges. To search for global optima, recent research has developed a variety of convex relaxations to bound the optimal objective values of AC OPF problems. However, the quality of these bounds varies for different test cases, suggesting that OPF problems exhibit a range of difficulties. Understanding this range of difficulty is helpful for improving relaxation algorithms. Power grids are naturally represented as graphs, with buses as nodes and power lines as edges. Graph theory offers various methods to measure power grid graphs, enabling researchers to characterize system structure and optimize algorithms. Leveraging graph theory-based algorithms, this paper presents an empirical study aiming to find correlations between optimality gaps and local structures in the underlying test case's graph. Network graphlets, which are induced subgraphs of a network, are used to investigate the correlation between power system topology and OPF relaxation tightness. Specifically, this paper examines how the existence of particular graphlets that are either too frequent or infrequent in the power system graph affects the tightness of the OPF convex relaxation. Numerous test cases are analyzed from a local structural perspective to establish a correlation between their topology and their OPF convex relaxation tightness.
7.Impact of Higher-Order Structures in Power Grids' Graph on Line Outage Distribution Factor
Authors:Nafis Sadik, Mohammad Rasoul Narimani
Abstract: Power systems often include a specific set of lines that are crucial for the regular operations of the grid. Identifying the reasons behind the criticality of these lines is an important challenge in power system studies. When a line fails, the line outage distribution factor (LODF) quantifies the changes in power flow on the remaining lines. This paper proposes a network analysis from a local structural perspective to investigate the impact of local structural patterns in the underlying graph of power systems on the LODF of individual lines. In particular, we focus on graphlet analysis to determine the local structural properties of each line. This research analyzes potential connections between specific graphlets and the most critical lines based on their LODF. In this regard, we investigate N-1 and N-2 contingency analysis for various test cases and identify the lines that have the greatest impact on the LODFs of other lines. We then determine which subgraphs contain the most significant lines. Our findings reveal that the most critical lines often belong to subgraphs with a less meshed but more radial structure. These findings are further validated through various test cases. In particular, it is observed that networks with a higher percentage of ring or meshed subgraphs on their most important line (based on LODF) experience a lower LODF when that critical line is subject to an outage. Additionally, we investigate how the LODF of the most critical line varies among different test cases and examine the subgraph characteristics of those critical lines.
1.Quantifying Distributional Model Risk in Marginal Problems via Optimal Transport
Authors:Yanqin Fan, Hyeonseok Park, Gaoqian Xu
Abstract: This paper studies distributional model risk in marginal problems, where each marginal measure is assumed to lie in a Wasserstein ball centered at a fixed reference measure with a given radius. Theoretically, we establish several fundamental results including strong duality, finiteness of the proposed Wasserstein distributional model risk, and the existence of an optimizer at each radius. In addition, we show continuity of the Wasserstein distributional model risk as a function of the radius. Using strong duality, we extend the well-known Makarov bounds for the distribution function of the sum of two random variables with given marginals to Wasserstein distributionally robust Makarov bounds. Practically, we illustrate our results on four distinct applications when the sample information comes from multiple data sources and only some marginal reference measures are identified. They are: partial identification of treatment effects; externally valid treatment choice via robust welfare functions; Wasserstein distributionally robust estimation under data combination; and evaluation of the worst aggregate risk measures.
2.Variational theory and algorithms for a class of asymptotically approachable nonconvex problems
Authors:Hanyang Li, Ying Cui
Abstract: We investigate a class of composite nonconvex functions, where the outer function is the sum of univariate extended-real-valued convex functions and the inner function is the limit of difference-of-convex functions. A notable feature of this class is that the inner function can be merely lower semicontinuous instead of continuous. It covers a range of important yet challenging applications, including the composite value functions of nonlinear programs, the weighted value-at-risk for continuously distributed random variables, and composite rank functions. We propose an asymptotic decomposition of the composite function that guarantees epi-convergence to the original function, leading to necessary optimality conditions for the corresponding minimization problems. The proposed decomposition also enables us to design a numerical algorithm that is provably convergent to a point satisfying the newly introduced optimality conditions. These results expand on the study of so-called amenable functions introduced by Poliquin and Rockafellar in 1992, which are compositions of convex functions with smooth maps, and the prox-linear methods for their minimization.
3.Monte Carlo Policy Gradient Method for Binary Optimization
Authors:Cheng Chen, Ruitao Chen, Tianyou Li, Ruichen Ao, Zaiwen Wen
Abstract: Binary optimization has a wide range of applications in combinatorial optimization problems such as MaxCut, MIMO detection, and MaxSAT. However, these problems are typically NP-hard due to the binary constraints. We develop a novel probabilistic model to sample the binary solution according to a parameterized policy distribution. Specifically, minimizing the KL divergence between the parameterized policy distribution and the Gibbs distributions of the function value leads to a stochastic optimization problem whose policy gradient can be derived explicitly similar to reinforcement learning. For coherent exploration in discrete spaces, parallel Markov Chain Monte Carlo (MCMC) methods are employed to sample from the policy distribution with diversity and approximate the gradient efficiently. We further develop a filter scheme to replace the original objective function by the one with the local search technique to broaden the horizon of the function landscape. Convergence to stationary points in expectation of the policy gradient method is established based on the concentration inequality for MCMC. Numerical results show that this framework is very promising to provide near-optimal solutions for quite a few binary optimization problems.
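The policy-gradient idea can be sketched on MaxCut with a much simpler sampler than the one described above: an independent Bernoulli policy sampled directly, in place of the parallel MCMC and the local-search filter of the abstract. Everything in this sketch (function names, learning rate, batch size, the mean-value baseline, the four-cycle test graph) is an assumption for illustration and not the authors' algorithm.

```python
import math
import random

def cut_value(x, edges):
    # Number of edges whose endpoints receive different labels.
    return sum(1 for i, j in edges if x[i] != x[j])

def policy_gradient_maxcut(n, edges, iters=200, batch=16, lr=0.5, seed=0):
    """REINFORCE-style ascent on E[cut(x)] with x_i ~ Bernoulli(sigmoid(theta_i))."""
    rng = random.Random(seed)
    theta = [0.0] * n
    best_x, best_val = None, -1
    for _ in range(iters):
        probs = [1.0 / (1.0 + math.exp(-t)) for t in theta]
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(batch)]
        values = [cut_value(x, edges) for x in samples]
        baseline = sum(values) / batch        # variance-reduction baseline
        grad = [0.0] * n
        for x, v in zip(samples, values):
            if v > best_val:
                best_x, best_val = x, v
            for i in range(n):
                # d/d theta_i of log p(x) equals x_i - sigmoid(theta_i).
                grad[i] += (v - baseline) * (x[i] - probs[i]) / batch
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return best_x, best_val

# Hypothetical instance: a 4-cycle, whose maximum cut has size 4.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
best_x, best_val = policy_gradient_maxcut(4, edges)
```

The product-Bernoulli policy is the main simplification: it cannot represent correlations between variables, which is one reason a richer sampling scheme such as MCMC is attractive on harder instances.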
4.Global stabilization of sterile insect technique model by feedback laws
Authors:Kala Agbo Bidi (LJLL), Luis Almeida (LJLL), Jean-Michel Coron (LJLL)
Abstract: The Sterile Insect Technique (SIT) is presently one of the most ecological methods for controlling insect pests responsible for disease transmission or crop destruction worldwide. This technique consists of releasing sterile males into the insect pest population. This approach aims at reducing fertility in the population and, consequently, significantly reducing the native insect population after a few generations. In this work, we study the global stabilization of a pest population at extinction equilibrium by the SIT method. We construct explicit feedback laws that stabilize the model and perform numerical simulations to show the efficiency of our feedback laws. The different feedback laws are also compared, taking into account their possible implementation in field interventions.
5.Minimal-time nonlinear control via semi-infinite programming
Authors:Antoine Oustry (OptimiX, LIX, ENPC), Matteo Tacchi (GIPSA-lab)
Abstract: We address the problem of computing a control for a time-dependent nonlinear system to reach a target set in a minimal time. To solve this minimal time control problem, we introduce a hierarchy of linear semi-infinite programs, the values of which converge to the value of the control problem. These semi-infinite programs are increasing restrictions of the dual of the nonlinear control problem, which is a maximization problem over the subsolutions of the Hamilton-Jacobi-Bellman (HJB) equation. Our approach is compatible with generic dynamical systems and state constraints. Specifically, we use an oracle that, for a given differentiable function, returns a point at which the function violates the HJB inequality. We solve the semi-infinite programs using a classical convex optimization algorithm with a convergence rate of O(1/k), where k is the number of calls to the oracle. This algorithm yields subsolutions of the HJB equation that approximate the value function and provide a lower bound on the optimal time. We study the closed-loop control built on the obtained approximate value functions, and we give theoretical guarantees on its performance depending on the approximation error for the value function. We show promising numerical results for three non-polynomial systems with up to 6 state variables and 5 control variables.
6.Coefficient Control of Variational Inequalities
Authors:Andreas Hehl, Denis Khimin, Ira Neitzel, Nicolai Simon, Thomas Wick, Winnifried Wollner
Abstract: Within this chapter, we discuss control in the coefficients of an obstacle problem. Utilizing tools from H-convergence, we show existence of optimal solutions. First order necessary optimality conditions are obtained after deriving directional differentiability of the coefficient to solution mapping for the obstacle problem. Further, considering a regularized obstacle problem as a constraint yields a limiting optimality system after proving strong convergence of the regularized control and state variables. Numerical examples underline convergence with respect to the regularization. Finally, some numerical experiments highlight the possible extension of the results to coefficient control in phase-field fracture.
7.On the stochastic inventory problem under order capacity constraints
Authors:Roberto Rossi, Zhen Chen, S. Armagan Tarim
Abstract: We consider the single-item single-stocking location stochastic inventory system under a fixed ordering cost component. A long-standing problem is that of determining the structure of the optimal control policy when this system is subject to order quantity capacity constraints; to date, only partial characterisations of the optimal policy have been discussed. An open question is whether a policy with a single continuous interval over which ordering is prescribed is optimal for this problem. Under the so-called "continuous order property" conjecture, we show that the optimal policy takes the modified multi-$(s,S)$ form. Moreover, we provide a numerical counterexample in which the continuous order property is violated, and hence show that a modified multi-$(s,S)$ policy is not optimal in general. However, in an extensive computational study, we show that instances violating the continuous order property are extremely rare in practice, and that the plans generated by a modified multi-$(s,S)$ policy can therefore be considered, for all practical purposes, optimal. Finally, we show that a modified $(s,S)$ policy also performs well in practice.
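A single-threshold modified $(s,S)$ rule of the kind mentioned in the last sentence can be sketched as follows: when the inventory level falls below $s$, order up to $S$ or as close to $S$ as the capacity allows. The strict-inequality threshold convention, the backordering assumption, the function names and the demand sequence are all illustrative assumptions, not details taken from the paper.

```python
def modified_sS_order(inventory, s, S, capacity):
    """Order quantity under a modified (s, S) rule with an order capacity."""
    if inventory >= s:
        return 0                      # at or above the reorder point: no order
    return min(S - inventory, capacity)

def simulate(initial, demands, s, S, capacity):
    """Apply the rule period by period; unmet demand is backordered."""
    level, orders = initial, []
    for d in demands:
        q = modified_sS_order(level, s, S, capacity)
        orders.append(q)
        level = level + q - d         # level may go negative (backorders)
    return orders, level

# Hypothetical data: constant demand 3, reorder point 2, target 10, capacity 4.
orders, final_level = simulate(0, [3, 3, 3, 3], s=2, S=10, capacity=4)
```

Because the capacity (4) is well below the gap to the order-up-to level, every order placed in this run is capacity-bound, which is exactly the regime in which the structure of the optimal policy becomes delicate.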
8.Fast Convergence of Inertial Multiobjective Gradient-like Systems with Asymptotic Vanishing Damping
Authors:Konstantin Sonntag, Sebastian Peitz
Abstract: We present a new gradient-like dynamical system related to unconstrained convex smooth multiobjective optimization which involves inertial effects and asymptotic vanishing damping. To the best of our knowledge, this system is the first inertial gradient-like system for multiobjective optimization problems including asymptotic vanishing damping, expanding the ideas laid out in [H. Attouch and G. Garrigos, Multiobjective optimization: an inertial approach to Pareto optima, preprint, arXiv:1506.02823, 201]. We prove existence of solutions to this system in finite dimensions and further prove that its bounded solutions converge weakly to weakly Pareto optimal points. In addition, we obtain a convergence rate of order $O(t^{-2})$ for the function values measured with a merit function. This approach presents a good basis for the development of fast gradient methods for multiobjective optimization.
9.Feasibility problems via paramonotone operators in a convex setting
Authors:J. Camacho, M. J. Cánovas, J. E. Martínez-Legaz, J. Parra
Abstract: This paper is focused on some properties of paramonotone operators on Banach spaces and their application to certain feasibility problems for convex sets in a Hilbert space and convex systems in the Euclidean space. In particular, it shows that operators that are simultaneously paramonotone and bimonotone are constant on their domains, and this fact is applied to tackle two particular situations. The first one, closely related to simultaneous projections, deals with a finite number of convex sets with an empty intersection and tackles the problem of finding the smallest perturbations (in the sense of translations) of these sets to reach a nonempty intersection. The second is focused on the distance to feasibility; specifically, given an inconsistent convex inequality system, our goal is to compute/estimate the smallest right-hand side perturbations that reach feasibility. We advance that this work derives lower and upper estimates of such a distance, which become the exact value when confined to linear systems.
10.Stochastic Recursive Optimal Control of McKean-Vlasov Type: A Viscosity Solution Approach
Authors:Liangquan Zhang
Abstract: In this paper, we study a kind of optimal control problem for forward-backward stochastic differential equations (FBSDEs for short) of McKean-Vlasov type via the dynamic programming principle (DPP for short), motivated by studying the infinite dimensional Hamilton-Jacobi-Bellman (HJB for short) equation derived from the decoupling field of the FBSDEs posed by Carmona and Delarue (Ann. Probab., 2015). At the beginning, by considering the cost functional defined by the backward component of the solution of the controlled FBSDEs alluded to earlier, on the one hand, we prove that the value function is a deterministic function of the initial random variable; on the other hand, we show that the value function is law-invariant, i.e., it depends on the initial random variable only via its distribution, by virtue of a BSDE property. Afterward, utilizing the notion of differentiability with respect to probability measures introduced by P.L. Lions, we are able to establish a DPP for the value function in the Wasserstein space of probability measures based on the BSDE approach, in particular employing the notion of stochastic backward semigroups associated with stochastic optimal control problems and Itô's formula along a flow property of the conditional law of the controlled forward state process. We prove that the value function is the unique viscosity solution of the associated generalized HJB equations in some separable Hilbert space. Finally, as an application, we formulate an optimal control problem for linear stochastic differential equations with quadratic cost functionals of McKean-Vlasov type under nonlinear expectation, namely the $g$-expectation introduced by Peng, and derive the optimal feedback control explicitly by means of several groups of Riccati equations.
11.Incomplete Information Linear-Quadratic Mean-Field Games and Related Riccati Equations
Authors:Min Li, Tianyang Nie, Shunjun Wang, Ke Yan
Abstract: We study a class of linear-quadratic mean-field games with incomplete information. For each agent, the state is given by a linear forward stochastic differential equation with common noise. Moreover, both the state and control variables can enter the diffusion coefficients of the state equation. We deduce the open-loop adapted decentralized strategies and feedback decentralized strategies by mean-field forward-backward stochastic differential equation and Riccati equations, respectively. The well-posedness of the corresponding consistency condition system is obtained and the limiting state-average turns out to be the solution of a mean-field stochastic differential equation driven by common noise. We also verify the $\varepsilon$-Nash equilibrium property of the decentralized control strategies. Finally, a network security problem is studied to illustrate our results as an application.
12.Hoffman constant of the argmin mapping in linear optimization
Authors:J. Camacho, M. J. Cánovas, H. Gfrerer, J. Parra
Abstract: The main contribution of this paper consists of providing an explicit formula to compute the Hoffman constant of the argmin mapping in linear optimization. The work is developed in the context of right-hand side perturbations of the constraint system as the Hoffman constant is always infinite when we perturb the objective function coefficients, unless the left-hand side of the constraints reduces to zero. In our perturbation setting, the argmin mapping is a polyhedral mapping whose graph is the union of convex polyhedral sets which assemble in such a nice way that global measures of the stability (Hoffman constants) can be computed through semilocal and local ones (as Lipschitz upper semicontinuity and calmness moduli, whose computation has been developed in previous works). Indeed, we isolate this nice behavior of the graph in the concept of well-connected polyhedral mappings and, in a first step, the paper focuses on the Hoffman constant for these multifunctions. When confined to the optimal set, some specifics on directional stability are also presented.
13.Synthesizing Control Laws from Data using Sum-of-Squares Optimization
Authors:Jason J. Bramburger, Steven Dahdah, James Richard Forbes
Abstract: The control Lyapunov function (CLF) approach to nonlinear control design is well established. Moreover, when the plant is control affine and polynomial, sum-of-squares (SOS) optimization can be used to find a polynomial controller as a solution to a semidefinite program. This letter considers the use of data-driven methods to design a polynomial controller by leveraging Koopman operator theory, CLFs, and SOS optimization. First, Extended Dynamic Mode Decomposition (EDMD) is used to approximate the Lie derivative of a given CLF candidate with polynomial lifting functions. Then, the polynomial Koopman model of the Lie derivative is used to synthesize a polynomial controller via SOS optimization. The result is a flexible data-driven method that skips the intermediary process of system identification and can be applied widely to control problems. The proposed approach is used to successfully synthesize a controller to stabilize an inverted pendulum on a cart.
14.Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm
Authors:Amrutha Varshini Ramesh, Aaron Mishkin, Mark Schmidt, Yihan Zhou, Jonathan Wilder Lavington, Jennifer She
Abstract: We consider minimizing a smooth function subject to a summation constraint over its variables. By exploiting a connection between the greedy 2-coordinate update for this problem and equality-constrained steepest descent in the 1-norm, we give a convergence rate for greedy selection under a proximal Polyak-Lojasiewicz assumption that is faster than random selection and independent of the problem dimension $n$. We then consider minimizing with both a summation constraint and bound constraints, as arises in the support vector machine dual problem. Existing greedy rules for this setting either guarantee trivial progress only or require $O(n^2)$ time to compute. We show that bound- and summation-constrained steepest descent in the L1-norm guarantees more progress per iteration than previous rules and can be computed in only $O(n \log n)$ time.
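The greedy 2-coordinate update described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the quadratic objective $f(x)=\tfrac12\|x-c\|^2$, the smoothness constant `L`, and the function names are assumptions chosen for illustration. Each step picks the coordinates with the largest and smallest partial derivatives and shifts mass between them, so the sum of the variables never changes.

```python
def greedy_two_coordinate_step(x, grad, L):
    """One greedy 2-coordinate step that keeps sum(x) unchanged.

    Assumes the gradient of the objective is L-Lipschitz (hypothetical setup).
    """
    i = max(range(len(x)), key=lambda k: grad[k])  # largest partial derivative
    j = min(range(len(x)), key=lambda k: grad[k])  # smallest partial derivative
    # Minimizes the quadratic upper bound along the direction e_j - e_i.
    delta = (grad[i] - grad[j]) / (2.0 * L)
    x = list(x)
    x[i] -= delta
    x[j] += delta
    return x


def minimize_sum_constrained(c, x0, L=1.0, iters=200):
    """Minimize 0.5 * ||x - c||^2 subject to sum(x) == sum(x0)."""
    x = list(x0)
    for _ in range(iters):
        grad = [xk - ck for xk, ck in zip(x, c)]  # gradient of the illustrative objective
        x = greedy_two_coordinate_step(x, grad, L)
    return x
```

For this objective the constrained minimizer is $c$ shifted by a constant so that the sum matches that of the starting point, which gives a simple check on the iterates.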
15.A numerical algorithm for attaining the Chebyshev bound in optimal learning
Authors:Pradyumna Paruchuri, Debasish Chatterjee
Abstract: Given a compact subset of a Banach space, the Chebyshev center problem consists of finding a minimal circumscribing ball containing the set. In this article we establish a numerically tractable algorithm for solving the Chebyshev center problem in the context of optimal learning from a finite set of data points. For a hypothesis space realized as a compact but not necessarily convex subset of a finite-dimensional subspace of some underlying Banach space, this algorithm computes the Chebyshev radius and the Chebyshev center of the hypothesis space, thereby solving the problem of optimal recovery of functions from data. The algorithm itself is based on, and significantly extends, recent results for near-optimal solutions of convex semi-infinite problems by means of targeted sampling, and it is of independent interest. Several examples of numerical computations of Chebyshev centers are included in order to illustrate the effectiveness of the algorithm.
16.A geometric framework for discrete time port-Hamiltonian systems
Authors:Karim Cherifi, Hannes Gernandt, Dorothea Hinsen, Volker Mehrmann
Abstract: Port-Hamiltonian systems provide an energy-based formulation with a model class that is closed under structure preserving interconnection. For continuous-time systems these interconnections are constructed by geometric objects called Dirac structures. In this paper, we derive this geometric formulation and the interconnection properties for scattering passive discrete-time port-Hamiltonian systems.
1.Calm local optimality for nonconvex-nonconcave minimax problems
Authors:Xiaoxiao Ma, Wei Yao, Jane J. Ye, Jin Zhang
Abstract: Nonconvex-nonconcave minimax problems have found numerous applications in various fields including machine learning. However, questions remain about what is a good surrogate for local minimax optimum and how to characterize the minimax optimality. Recently Jin, Netrapalli, and Jordan (ICML 2020) introduced a concept of local minimax point and derived optimality conditions for the smooth and unconstrained case. In this paper, we introduce the concept of calm local minimax point, which is a local minimax point with a calm radius function. With the extra calmness property we obtain first and second-order sufficient and necessary optimality conditions for a very general class of nonsmooth nonconvex-nonconcave minimax problems. Moreover, we show that the calm local minimax optimality and the local minimax optimality coincide under a weak sufficient optimality condition for the maximization problem. This equivalence allows us to derive stronger optimality conditions under weaker assumptions for local minimax optimality.
2.Impulse control with generalised discounting
Authors:Damian Jelito, Łukasz Stettner
Abstract: In this paper, we investigate the effects of applying generalised (non-exponential) discounting on a long-run impulse control problem for a Feller-Markov process. We show that the optimal value of the discounted problem is the same as the optimal value of its undiscounted version. Next, we prove that an optimal strategy for the undiscounted discrete time functional is also optimal for the discrete-time discounted criterion and nearly optimal for the continuous-time discounted one. This shows that the discounted problem, being time-inconsistent in nature, admits a time-consistent solution. Also, instead of a complex time-dependent Bellman equation one may consider its simpler time-independent version.
3.An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems
Authors:Clément Lezane, Cristóbal Guzmán, Alexandre d'Aspremont
Abstract: In this work, we revisit the problem of solving large-scale semidefinite programs using randomized first-order methods and stochastic smoothing. We introduce two oblivious stochastic mirror descent algorithms based on a complementary composite setting. One algorithm is designed for non-smooth objectives, while an accelerated version is tailored for smooth objectives. Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function. For the non-smooth case with $\mathcal{M}$-bounded oracles, we prove a convergence rate of $O(\mathcal{M}/\sqrt{T})$. For the $L$-smooth case with a feasible set bounded by $D$, we derive a convergence rate of $O(L^2 D^2/(T^{2}\sqrt{T}) + (D_0^2+\sigma^2)/\sqrt{T})$, where $D_0$ is the starting distance to an optimal solution, and $\sigma^2$ is the stochastic oracle variance. These rates had only been obtained so far by either assuming prior knowledge of the Lipschitz constant or the starting distance to an optimal solution. We further show how to extend our framework to relative scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
4.Convergence property of the Quantized Distributed Gradient descent with constant stepsizes and an effective strategy for the stepsize selection
Authors:Woocheol Choi, Myeong-Su Lee
Abstract: In this paper, we establish new convergence results for the quantized distributed gradient descent and suggest a novel strategy for choosing the stepsizes to achieve high performance of the algorithm. Under the strong convexity assumption on the aggregate cost function and the smoothness assumption on each local cost function, we prove the algorithm converges exponentially fast to a small neighborhood of the optimizer whose radius depends on the stepsizes. Based on our convergence result, we suggest an effective selection of stepsizes which repeatedly diminishes the stepsizes after a specific number of iterations. Both the convergence results and the effectiveness of the suggested stepsize selection are also verified by the numerical experiments.
5.Homogeneous Second-Order Descent Framework: A Fast Alternative to Newton-Type Methods
Authors:Chang He, Yuntian Jiang, Chuwen Zhang, Dongdong Ge, Bo Jiang, Yinyu Ye
Abstract: This paper proposes a homogeneous second-order descent framework (HSODF) for nonconvex and convex optimization based on the generalized homogeneous model (GHM). In comparison to the Newton steps, the GHM can be solved by extremal symmetric eigenvalue procedures and thus grants an advantage in ill-conditioned problems. Moreover, GHM extends the ordinary homogeneous model (OHM) to allow adaptiveness in the construction of the aggregated matrix. Consequently, HSODF is able to recover some well-known second-order methods such as trust-region methods and gradient regularized methods while maintaining comparable iteration complexity bounds. We also study two specific realizations of HSODF. One is adaptive HSODM, which has a parameter-free $O(\epsilon^{-3/2})$ global complexity bound for nonconvex second-order Lipschitz continuous functions. The other one is homotopy HSODM, which is proven to have a global linear rate of convergence without strong convexity. The efficiency of our approach on ill-conditioned and high-dimensional problems is justified by some preliminary numerical results.
6.The risk-sensitive optimal stopping problem: geometric solution and algorithms
Authors:Tomasz Kosmala, John Moriarty
Abstract: We use the geometry of functions associated with martingales under a risk measure to solve risk-sensitive Markovian optimal stopping problems. Generalising the risk-neutral case due to Dynkin and Yushkievich (1969), the risk-sensitive value function is the pointwise infimum of those functions which dominate the gain function. The functions are not required to be differentiable, can explode to infinity, and form a three-dimensional set, and in the differentiable case the smooth fit principle holds. Only elementary properties of the driving Markov process $X$ are used. Algorithms are provided to construct the value function, with the computational cost of a two-dimensional search.
7.Convex quartic problems: homogenized gradient method and preconditioning
Authors:Radu-Alexandru Dragomir, Yurii Nesterov
Abstract: We consider a convex minimization problem for which the objective is the sum of a homogeneous polynomial of degree four and a linear term. Such a task arises as a subproblem in algorithms for quadratic inverse problems with a difference-of-convex structure. We design a first-order method called Homogenized Gradient, along with an accelerated version, which enjoy fast convergence rates of respectively $\mathcal{O}(\kappa^2/K^2)$ and $\mathcal{O}(\kappa^2/K^4)$ in relative accuracy, where $K$ is the iteration counter. The constant $\kappa$ is the quartic condition number of the problem. Then, we show that for a certain class of problems, it is possible to compute a preconditioner for which this condition number is $\sqrt{n}$, where $n$ is the problem dimension. To establish this, we study the more general problem of finding the best quadratic approximation of an $\ell_p$ norm composed with a quadratic map. Our construction involves a generalization of the so-called Lewis weights.
8.Algorithms for Shipping Container Delivery Scheduling
Authors:Anna Collins, Dimitrios Letsios, Gueorgui Mihaylov
Abstract: Motivated by distribution problems arising in the supply chain of Haleon, we investigate a discrete optimization problem that we call the "container delivery scheduling problem". The problem models a supplier dispatching ordered products with shipping containers from manufacturing sites to distribution centers, where orders are collected by the buyers at agreed due times. The supplier may expedite or delay item deliveries to reduce transshipment costs at the price of increasing inventory costs, as measured by the number of containers and distribution center storage/backlog costs, respectively. The goal is to compute a delivery schedule attaining good trade-offs between the two. This container delivery scheduling problem is a temporal variant of classic bin packing problems, where the item sizes are not fixed, but depend on the item due times and delivery times. An approach for solving the problem should specify a batching policy for container consolidation and a scheduling policy for deciding when each container should be delivered. Based on the available item due times, we develop algorithms with sequential and nested batching policies as well as on-time and delay-tolerant scheduling policies. We elaborate on the problem's hardness and substantiate the proposed algorithms with positive and negative approximation bounds, including the derivation of an algorithm achieving an asymptotically tight 2-approximation ratio.
9.Accelerating Inexact HyperGradient Descent for Bilevel Optimization
Authors:Haikuo Yang, Luo Luo, Chris Junchi Li, Michael I. Jordan
Abstract: We present a method for solving general nonconvex-strongly-convex bilevel optimization problems. Our method -- the \emph{Restarted Accelerated HyperGradient Descent} (\texttt{RAHGD}) method -- finds an $\epsilon$-first-order stationary point of the objective with $\tilde{\mathcal{O}}(\kappa^{3.25}\epsilon^{-1.75})$ oracle complexity, where $\kappa$ is the condition number of the lower-level objective and $\epsilon$ is the desired accuracy. We also propose a perturbed variant of \texttt{RAHGD} for finding an $\big(\epsilon,\mathcal{O}(\kappa^{2.5}\sqrt{\epsilon}\,)\big)$-second-order stationary point within the same order of oracle complexity. Our results achieve the best-known theoretical guarantees for finding stationary points in bilevel optimization and also improve upon the existing upper complexity bound for finding second-order stationary points in nonconvex-strongly-concave minimax optimization problems, setting a new state-of-the-art benchmark. Empirical studies are conducted to validate the theoretical results in this paper.
10.Convex Optimization in Legged Robots
Authors:Prathamesh Saraf, Mustafa Shaikh, Myron Phan
Abstract: Convex optimization is crucial in controlling legged robots, where stability and optimal control are vital. Many control problems can be formulated as convex optimization problems, with a convex cost function and constraints capturing system dynamics. Our review focuses on active balancing problems and presents a general framework for formulating them as second-order cone programming (SOCP) for robustness and efficiency with existing interior point algorithms. We then discuss some prior work around the Zero Moment Point stability criterion, Linear Quadratic Regulator Control, and then the feedback model predictive control (MPC) approach to improve prediction accuracy and reduce computational costs. Finally, these techniques are applied to stabilize the robot for jumping and landing tasks. Further research in convex optimization of legged robots can have a significant societal impact. It can lead to improved gait planning and active balancing which enhances their ability to navigate complex environments, assist in search and rescue operations and perform tasks in hazardous environments. These advancements have the potential to revolutionize industries and help humans in daily life.
11.Optimal Control of Chromate Removal via Enhanced Modeling using the Method of Moments
Authors:Fred Ghanem, Kirti M. Yenkie
Abstract: Single-use anion-exchange resins can reduce hazardous chromates to safe levels in drinking water. However, since most process control strategies monitor effluent concentrations, detection of any chromate leakage leads to premature resin replacement. Furthermore, variations in the inlet chromate concentration and other process conditions make process control a challenging step. In this work, we capture the uncertainty of the process conditions by applying the Ito process of Brownian motion with drift into a stochastic optimal control strategy. The ion exchange process is modeled using the method of moments which helps capture the process dynamics, later formulated into mathematical objectives representing desired chromate removal. We then solved our developed models as an optimal control problem via Pontryagin's maximum principle. The objectives enabled a successful control via flow rate adjustments leading to higher chromate extraction. Such an approach maximized the capacity of the resin and column efficiency to remove toxic compounds from water while capturing deviations in the process conditions.
1.Moreau Envelope Based Difference-of-weakly-Convex Reformulation and Algorithm for Bilevel Programs
Authors:Lucy L. Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang
Abstract: Recently, Ye et al. (Mathematical Programming 2023) designed an algorithm for solving a specific class of bilevel programs with an emphasis on applications related to hyperparameter selection, utilizing the difference of convex algorithm based on the value function approach reformulation. The proposed algorithm is particularly powerful when the lower level problem is fully convex, such as a support vector machine model or a least absolute shrinkage and selection operator model. In this paper, to suit more applications related to machine learning and statistics, we substantially weaken the underlying assumption from lower level full convexity to weak convexity. Accordingly, we propose a new reformulation using the Moreau envelope of the lower level problem and demonstrate that this reformulation is a difference of weakly convex program. Subsequently, we develop a sequentially convergent algorithm for solving this difference of weakly convex program. To evaluate the effectiveness of our approach, we conduct numerical experiments on the bilevel hyperparameter selection problem from elastic net, sparse group lasso, and RBF kernel support vector machine models.
2.Sampling-Based Approaches for Multimarginal Optimal Transport Problems with Coulomb Cost
Authors:Yukuan Hu, Mengyu Li, Xin Liu, Cheng Meng
Abstract: The multimarginal optimal transport problem with Coulomb cost arises in quantum physics and is vital in understanding strongly correlated quantum systems. Its intrinsic curse of dimensionality can be overcome with a Monge-like ansatz. A nonconvex quadratic programming problem then emerges after employing discretization and $\ell_1$ penalty. To globally solve this nonconvex problem, we adopt a grid refinements-based framework, in which a local solver is heavily invoked and hence significantly determines the overall efficiency. The block structure of this nonconvex problem suggests taking block coordinate descent-type methods as the local solvers, while the existing ones can get seriously afflicted with the poor scalability induced by the associated sparse-dense matrix multiplications. In this work, borrowing the tools from optimal transport, we develop novel methods that favor highly scalable schemes for subproblems and are completely free of the full matrix multiplications after introducing entrywise sampling. Convergence and asymptotic properties are built on the theory of random matrices. The numerical results on several typical physical systems corroborate the effectiveness and better scalability of our approach, which also allows the first visualization for the approximate optimal transport maps between electrons in three-dimensional contexts.
3.Approximate controllability of a 2D linear system related to the motion of two fluids with surface tension
Authors:Sebastien Court
Abstract: We consider a coupled system of partial differential equations describing the interactions between a closed free interface and two viscous incompressible fluids. The fluids are assumed to satisfy the incompressible Navier-Stokes equations in time-dependent domains that are determined by the free interface. The mean curvature of the interface induces a surface tension force that creates a jump of the Cauchy stress tensor on both sides. It influences the behavior of the surrounding fluids, and therefore the deformation of this interface via the equality of velocities. In dimension 2, the steady states correspond to immobile interfaces that are circles that all have the same volume. Considering small displacements of steady states, we are led to consider a linearized version of this system. We prove that the latter is approximately controllable to a given steady state for any time $T>0$ by means of additional surface tension type forces, provided that the radius of the circle of reference does not coincide with a scaled zero of the Bessel function of the first kind.
4.A Low-Power Hardware-Friendly Optimisation Algorithm With Absolute Numerical Stability and Convergence Guarantees
Authors:Anis Hamadouche, Yun Wu, Andrew M. Wallace, Joao F. C. Mota
Abstract: We propose Dual-Feedback Generalized Proximal Gradient Descent (DFGPGD) as a new, hardware-friendly, operator splitting algorithm. We then establish convergence guarantees under approximate computational errors and we derive theoretical criteria for the numerical stability of DFGPGD based on absolute stability of dynamical systems. We also propose a new generalized proximal ADMM that can be used to instantiate most existing proximal-based composite optimization solvers. We implement DFGPGD and ADMM on an FPGA ZCU106 board and compare them in light of FPGA's timing as well as resource utilization and power efficiency. We also perform a full-stack, application-to-hardware, comparison between approximate versions of DFGPGD and ADMM based on dynamic power/error rate trade-off, which is a new hardware-application combined metric.
5.A Counterexample to D. J. White's Theorem on a Vector-valued Extension of the Optimality Equations of a Markov Decision Process
Authors:Anas Mifrani
Abstract: It is well known that under the expected total reward criterion, the optimal value of a finite-horizon Markov decision process can be determined by solving a set of recursively defined equations backward in time. An extension of those equations to vector-valued processes was proposed by D. J. White in 1982. By means of a counterexample, we show that the assumptions underlying this extension are insufficient to guarantee its validity. A strong assumption on state dynamics is introduced to resolve this issue.
6.Improved Convergence Bounds For Operator Splitting Algorithms With Rare Extreme Errors
Authors:Anis Hamadouche, Andrew M. Wallace, Joao F. C. Mota
Abstract: In this paper, we improve upon our previous work [24,22] and establish convergence bounds on the objective function values of approximate proximal-gradient descent (AxPGD), approximate accelerated proximal-gradient descent (AxAPGD) and approximate proximal ADMM (AxWLM-ADMM) schemes. We consider approximation errors that manifest rare extreme events and we propagate their effects through iterations. We establish probabilistic asymptotic and non-asymptotic convergence bounds as functions of the range (upper/lower bounds) and variance of approximation errors. We use the derived bound to assess AxPGD in a sparse model predictive control of a spacecraft system and compare its accuracy with previously derived bounds.
7.Robust Time-inconsistent Linear-Quadratic Stochastic Controls: A Stochastic Differential Game Approach
Authors:Bingyan Han, Chi Seng Pun, Hoi Ying Wong
Abstract: This paper studies robust time-inconsistent (TIC) linear-quadratic stochastic control problems, formulated by stochastic differential games. By a spike variation approach, we derive sufficient conditions for achieving the Nash equilibrium, which corresponds to a time-consistent (TC) robust policy, under mild technical assumptions. To illustrate our framework, we consider two scenarios of robust mean-variance analysis, namely with state- and control-dependent ambiguity aversion. We find numerically that with time inconsistency haunting the dynamic optimal controls, the ambiguity aversion enhances the effective risk aversion faster than the linear, implying that the ambiguity in the TIC cases is more impactful than that under the TC counterparts, e.g., expected utility maximization problems.
8.Consistency of sample-based stationary points for infinite-dimensional stochastic optimization
Authors:Johannes Milz
Abstract: We consider stochastic optimization problems with possibly nonsmooth integrands posed in Banach spaces and approximate these stochastic programs via a sample-based approach. We establish the consistency of approximate Clarke stationary points of the sample-based approximations. Our framework is applied to risk-averse semilinear PDE-constrained optimization using the average value-at-risk and to risk-neutral bilinear PDE-constrained optimization.
9.Pupil-driven quantitative differential phase contrast imaging
Authors:Shuhe Zhang, Hao Wu, Tao Peng, Zeyu Ke, Meng Shao, Tos T. J. M. Berendschot, Jinhua Zhou
Abstract: In this research, we reveal an inborn but hitherto ignored property of quantitative differential phase contrast (qDPC) imaging: the phase transfer function is an edge detection filter. Inspired by this, we highlight the duality of qDPC between optics and pattern recognition, and propose a simple and effective qDPC reconstruction algorithm, termed Pupil-Driven qDPC (pd-qDPC), to improve the phase reconstruction quality for the family of qDPC-based phase reconstruction algorithms. We form a new cost function in which a modified L0-norm is used to represent the pupil-driven edge sparsity, and the qDPC convolution operator is duplicated in the data fidelity term to achieve automatic background removal. Further, we develop iterative reweighted soft-threshold algorithms based on the split Bregman method to solve this modified L0-norm problem. We test pd-qDPC on both simulated and experimental data and compare against state-of-the-art (SOTA) methods including L2-norm, total variation regularization (TV-qDPC), isotropic-qDPC, and Retinex qDPC algorithms. Results show that our proposed model is superior in terms of phase reconstruction quality and implementation efficiency, and that it significantly increases the experimental robustness while maintaining the data fidelity. In general, pd-qDPC enables high-quality qDPC reconstruction without any modification of the optical system. It simplifies the system complexity and benefits the qDPC community and beyond, including but not limited to cell segmentation and PTF learning based on the edge filtering property.
10.PANTR: A proximal algorithm with trust-region updates for nonconvex constrained optimization
Authors:Alexander Bodard, Pieter Pas, Panagiotis Patrinos
Abstract: This work presents PANTR, an efficient solver for nonconvex constrained optimization problems, that is well-suited as an inner solver for an augmented Lagrangian method. The proposed scheme combines forward-backward iterations with solutions to trust-region subproblems: the former ensures global convergence, whereas the latter enables fast update directions. We discuss how the algorithm is able to exploit exact Hessian information of the smooth objective term through a linear Newton approximation, while benefiting from the structure of box-constraints or l1-regularization. An open-source C++ implementation of PANTR is made available as part of the NLP solver library ALPAQA. Finally, the effectiveness of the proposed method is demonstrated in nonlinear model predictive control applications.
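The forward-backward iteration mentioned in the abstract above can be sketched in its plainest form for an l1-regularized problem. This is only the basic proximal gradient step, not PANTR and not the ALPAQA API: there is no trust-region subproblem, and the toy objective $\tfrac12\|x-b\|^2+\lambda\|x\|_1$, the step size `gamma`, and all function names are assumptions made for illustration.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.| evaluated at the scalar v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def forward_backward_l1(b, lam, gamma=0.5, iters=100):
    """Proximal gradient iterations for 0.5 * ||x - b||^2 + lam * ||x||_1.

    gamma is a step size; the smooth term here has a 1-Lipschitz gradient,
    so any gamma in (0, 1] is a safe choice for this toy problem.
    """
    x = [0.0] * len(b)
    for _ in range(iters):
        # Forward (gradient) step on the smooth term, whose gradient is x - b.
        y = [xk - gamma * (xk - bk) for xk, bk in zip(x, b)]
        # Backward (proximal) step on the l1 term.
        x = [soft_threshold(yk, gamma * lam) for yk in y]
    return x
```

For this separable toy problem the minimizer is known in closed form, namely the soft-thresholding of `b` at level `lam`, which makes the iteration easy to sanity-check.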
11.The Boosted Double-Proximal Subgradient Algorithm for Nonconvex Optimization
Authors:Francisco J. Aragón-Artacho, Pedro Pérez-Aros, David Torregrosa-Belén
Abstract: In this paper we introduce the Boosted Double-proximal Subgradient Algorithm (BDSA), a novel splitting algorithm designed to address general structured nonsmooth and nonconvex mathematical programs expressed as sums and differences of composite functions. BDSA exploits the combined nature of subgradients from the data and proximal steps, and integrates a line-search procedure to enhance its performance. While BDSA encompasses existing schemes proposed in the literature, it extends its applicability to more diverse problem domains. We establish the convergence of BDSA under the Kurdyka--Lojasiewicz property and provide an analysis of its convergence rate. To evaluate the effectiveness of BDSA, we introduce a novel family of challenging test functions with an abundance of critical points. We conduct comparative evaluations demonstrating its ability to effectively escape non-optimal critical points. Additionally, we present two practical applications of BDSA for testing its efficacy, namely, a constrained minimum-sum-of-squares clustering problem and a nonconvex generalization of Heron's problem.
1.Stochastic Trip Planning in High Dimensional Public Transit Network
Authors:Raashid Altaf, Pravesh Biyani
Abstract: This paper proposes a generalised framework for density estimation in large networks with measurable spatiotemporal variance in edge weights. We solve the stochastic shortest path problem for a large network by estimating the density of the edge weights in the network and analytically finding the distribution of a path. In this study, we employ Gaussian Processes to model the edge weights. This approach not only reduces the analytical complexity associated with computing the stochastic shortest path but also yields satisfactory performance. We also provide an online version of the model that yields a 30 times speedup in the algorithm's runtime while retaining equivalent performance. As an application of the model, we design a real-time trip planning system to find the stochastic shortest path between locations in the public transit network of Delhi. Our observations show that different paths have different likelihoods of being the shortest path at any given time in a public transit network. We demonstrate that choosing the stochastic shortest path over a deterministic shortest path leads to savings in travel time of up to 40\%. Thus, our model takes a significant step towards creating a reliable trip planner and increasing the confidence of the general public in developing countries to take up public transit as a primary mode of transportation.
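The observation that different paths have different likelihoods of being the shortest can be illustrated with a small sketch. It is not the paper's model: it assumes, for simplicity, that edge travel times are independent Gaussians given as hypothetical `(mean, variance)` pairs, that the two paths share no edges, and that the total variance is positive. Under those assumptions a path's travel time is Gaussian with summed means and variances, and the probability that one path beats another follows from the normal CDF.

```python
import math


def path_distribution(edges):
    """Mean and variance of a path whose edges are independent Gaussians.

    edges is a list of (mean, variance) pairs; this format is an assumption.
    """
    mean = sum(m for m, _ in edges)
    var = sum(v for _, v in edges)
    return mean, var


def prob_first_is_shorter(path_a, path_b):
    """P(travel time of path_a < travel time of path_b) for edge-disjoint paths."""
    mean_a, var_a = path_distribution(path_a)
    mean_b, var_b = path_distribution(path_b)
    # The difference A - B is Gaussian with mean mean_a - mean_b and variance var_a + var_b.
    z = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A path with a slightly larger mean but much smaller variance can still have a sizeable chance of being the faster one, which is the kind of trade-off a deterministic shortest path ignores.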
2.Auction algorithm sensitivity for multi-robot task allocation
Authors:Katie Clinch, Tony A. Wood, Chris Manzie
Abstract: We consider the problem of finding a low-cost allocation and ordering of tasks between a team of robots in a $d$-dimensional, uncertain landscape, and the sensitivity of this solution to changes in the cost function. Various algorithms have been shown to give a 2-approximation to the MinSum allocation problem. By analysing such an auction algorithm, we obtain intervals on each cost, such that any fluctuation of the costs within these intervals will result in the auction algorithm outputting the same solution.
3.Guarantees for data-driven control of nonlinear systems using semidefinite programming: A survey
Authors:Tim Martin, Thomas B. Schön, Frank Allgöwer
Abstract: This survey presents recent research on determining control-theoretic properties and designing controllers with rigorous guarantees for nonlinear systems for which no mathematical models but measured trajectories are available. Data-driven control techniques have been developed to circumvent time-consuming modelling by first principles and because of the increasing availability of data. Recently, this research field has gained increased attention through the application of Willems' fundamental lemma, which provides a fertile ground for the development of data-driven control schemes with guarantees for linear time-invariant systems. While the fundamental lemma can be generalized to further system classes, there does not exist a comparably comprehensive theory for nonlinear systems. At the same time, nonlinear systems constitute the majority of practical systems. Moreover, they include additional challenges such as nonconvex optimization and data-based surrogate models that prevent end-to-end guarantees. Therefore, a variety of data-driven control approaches has been developed with different required prior insights into the system to ensure a guaranteed inference. In this survey, we will discuss developments in the context of data-driven control for nonlinear systems. In particular, we will focus on approaches providing guarantees from finite data, while the analysis and the controller design are computationally tractable due to semidefinite programming. Thus, these approaches achieve reasonable advances compared to the state-of-the-art system analysis and controller design by models from system identification.
4.Forward-backward algorithm for functions with locally Lipschitz gradient: applications to mean field games
Authors:Luis M. Briceno-Arias (XLIM), Francisco José Silva (XLIM), Xianjin Yang (CALTECH)
Abstract: In this paper, we provide a generalization of the forward-backward splitting algorithm for minimizing the sum of a proper convex lower semicontinuous function and a differentiable convex function whose gradient satisfies a locally Lipschitz-type condition. We prove the convergence of our method and derive a linear convergence rate when the differentiable function is locally strongly convex. We recover classical results in the case when the gradient of the differentiable function is globally Lipschitz continuous and an already known linear convergence rate when the function is globally strongly convex. We apply the algorithm to approximate equilibria of variational mean field game systems with local couplings. Compared with some benchmark algorithms to solve these problems, our numerical tests show similar performances in terms of the number of iterations but an important gain in the required computational time.
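To make the forward-backward idea concrete, here is a minimal sketch of a textbook proximal-gradient iteration with a backtracking step size, which is one common way to cope with a gradient that is only locally Lipschitz. It is not the authors' algorithm. The test problem (a quartic smooth term plus an $\ell_1$ penalty) and all function names are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Backward (proximal) step for t * ||.||_1, applied componentwise."""
    return [(abs(vi) - t) * (1.0 if vi > 0 else -1.0) if abs(vi) > t else 0.0
            for vi in v]

def f(x, b):
    """Smooth convex part: its gradient is locally but not globally Lipschitz."""
    return sum((xi - bi) ** 4 for xi, bi in zip(x, b)) / 4.0

def grad_f(x, b):
    return [(xi - bi) ** 3 for xi, bi in zip(x, b)]

def forward_backward(b, lam, x0, gamma=1.0, shrink=0.5, iters=500):
    """Minimize f(x) + lam * ||x||_1 by proximal gradient with backtracking.

    The step size gamma is only ever decreased (it is carried over between
    iterations), a deliberately simple choice for this sketch.
    """
    x = list(x0)
    for _ in range(iters):
        fx, g = f(x, b), grad_f(x, b)
        for _ in range(60):  # cap on backtracking attempts
            # Forward (gradient) step followed by backward (proximal) step.
            x_new = soft_threshold([xi - gamma * gi for xi, gi in zip(x, g)],
                                   gamma * lam)
            d = [a - c for a, c in zip(x_new, x)]
            model = (fx + sum(gi * di for gi, di in zip(g, d))
                     + sum(di * di for di in d) / (2.0 * gamma))
            # Accept once the quadratic model upper-bounds f at the trial point.
            if f(x_new, b) <= model + 1e-15 * max(1.0, abs(fx)):
                break
            gamma *= shrink
        x = x_new
    return x

x_star = forward_backward(b=[2.0, 0.5, -3.0], lam=1.0, x0=[0.0, 0.0, 0.0])
```

For this separable toy problem the minimizer can be checked by hand: each component is $b_i \mp \lambda^{1/3}$ when $|b_i|^3 > \lambda$ and zero otherwise.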
5.An optimal hierarchical control scheme for smart generation units: an application to combined steam and electricity generation
Authors:Stefano Spinelli, Marcello Farina, Andrea Ballarino
Abstract: Optimal management of thermal and energy grids with fluctuating demand and prices requires orchestrating the generation units (GU) among all their operating modes. A hierarchical approach is proposed to control coupled nonlinear energy systems. The high-level hybrid optimization defines the unit commitment, with the optimal transition strategy and best production profiles. The low-level dynamic model predictive control (MPC), receiving the set-points from the upper layer, safely governs the systems considering process constraints. To enhance the overall efficiency of the system, a method for the optimal start-up of the GU is presented here: a linear parameter-varying MPC computes the optimal trajectory in closed loop by iteratively linearising the system along the previous optimal solution. The introduction of an intermediate equilibrium state as an additional decision variable permits the reduction of the optimization horizon, while a terminal cost term steers the system to the target set-point. Simulation results show the effectiveness of the proposed approach.
6.Alternating minimization for simultaneous estimation of a latent variable and identification of a linear continuous-time dynamic system
Authors:Pierre-Cyril Aubin-Frankowski, Alain Bensoussan, S. Joe Qin
Abstract: We propose an optimization formulation for the simultaneous estimation of a latent variable and the identification of a linear continuous-time dynamic system, given a single input-output pair. We justify this approach based on Bayesian maximum a posteriori estimators. Our scheme takes the form of a convex alternating minimization, over the trajectories and the dynamic model respectively. We prove its convergence to a local minimum which satisfies a two-point boundary problem for the (latent) state variable and a tensor product expression for the optimal dynamics.
7.Equal area partitions of the sphere with diameter bounds, via optimal transport
Authors:Jun Kitagawa, Asuka Takatsu
Abstract: We prove existence of equal area partitions of the unit sphere via optimal transport methods, accompanied by diameter bounds written in terms of Monge--Kantorovich distances. This can be used to obtain bounds on the expectation of the maximum diameter of partition sets, when points are uniformly sampled from the sphere. An application to the computation of sliced Monge--Kantorovich distances is also presented.
8.Theory and applications of the Sum-Of-Squares technique
Authors:Francis Bach, Elisabetta Cornacchia, Luca Pesce, Giovanni Piccioli
Abstract: The Sum-of-Squares (SOS) approximation method is a technique used in optimization problems to derive lower bounds to the optimal value of an objective function. By representing the objective function as a sum of squares in a feature space, the SOS method transforms non-convex global optimization problems into solvable semidefinite programs. This note presents an overview of the SOS method. We start with its application in finite-dimensional feature spaces and, subsequently, we extend it to infinite-dimensional feature spaces using kernels (k-SOS). Additionally, we highlight the utilization of SOS for estimating some relevant quantities in information theory, including the log-partition function.
9.The interdependence between hospital choice and waiting time -- with a case study in urban China
Authors:Joris van de Klundert, Roberto Cominetti, Yun Liu, Qingxia Kong
Abstract: Hospital choice models often employ random utility theory and include waiting time as a choice determinant. When applied to evaluate health system improvement interventions, these models disregard that hospital choice in turn is a determinant of waiting time. We present a novel, general model capturing the endogenous relationship between waiting time and hospital choice, including the choice to opt out, and characterize the unique equilibrium solution of the resulting convex problem. We apply the general model in a case study on the urban Chinese health system, specifying that patient choice follows a multinomial logit (MNL) model and waiting times are determined by M/M/1 queues. The results reveal that analyses which solely rely on MNL models overestimate the effectiveness of present policy interventions and that this effectiveness is limited. We explore alternative, more effective, improvement interventions.
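The feedback loop between choice and waiting time can be sketched as a fixed-point computation. The code below is only an illustration under stated assumptions, not the paper's convex formulation: utilities are taken to be linear in the M/M/1 sojourn time $1/(\mu_i - \lambda_i)$, the opt-out utility is normalized to zero, the total arrival rate is below every service rate so all queues stay stable, and a damped iteration is used in place of a convex solver. All names and parameter values are hypothetical.

```python
import math

def choice_probabilities(a, beta, w):
    """MNL shares with an opt-out alternative of utility 0."""
    expu = [math.exp(ai - beta * wi) for ai, wi in zip(a, w)]
    denom = 1.0 + sum(expu)
    return [e / denom for e in expu]

def mm1_waits(mu, lam):
    """Expected time in system of an M/M/1 queue, per hospital."""
    return [1.0 / (m - l) for m, l in zip(mu, lam)]

def equilibrium(a, beta, mu, total_rate, damping=0.5, tol=1e-12, max_iter=1000):
    """Damped fixed-point iteration on the waiting times.

    Returns (p, w): choice probabilities and the waiting times they induce.
    """
    if total_rate >= min(mu):
        raise ValueError("sketch assumes total_rate < min(mu) for stability")
    w = [1.0 / m for m in mu]  # start from the waits of empty systems
    for _ in range(max_iter):
        p = choice_probabilities(a, beta, w)
        induced = mm1_waits(mu, [total_rate * pi for pi in p])
        w_next = [(1.0 - damping) * wi + damping * ti
                  for wi, ti in zip(w, induced)]
        if max(abs(x - y) for x, y in zip(w_next, w)) < tol:
            return p, induced
        w = w_next
    raise RuntimeError("fixed-point iteration did not converge")

p_eq, w_eq = equilibrium(a=[1.0, 0.5], beta=1.0, mu=[2.0, 3.0], total_rate=1.5)
```

At the returned point, the waiting times generate the choice shares and the shares generate the waiting times, which is the endogeneity that a stand-alone MNL analysis ignores.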
10.An optimization approach to study the phase changing behavior of multi-component mixtures
Authors:Gustavo E. O. Celis, Reza Arefidamghani, Hamidreza Anbarlooei, Daniel O. A. Cruz
Abstract: The appropriate design, construction, and operation of carbon capture and storage (CCS) and enhanced oil recovery (EOR) processes require a deep understanding of the resulting phase behavior in hydrocarbon-CO$_2$ multi-component mixtures under reservoir conditions. To model this behavior, a nonlinear system consisting of the equation of state and some mixing rules (for each component) needs to be solved simultaneously. The mixing usually requires modeling the binary interaction between the components of the mixture. This work employs optimization techniques to enhance the predictions of such models by optimizing the binary interaction parameters. The results show that the optimized parameters, although obtained mathematically, are in physical ranges and can successfully reproduce the experimental observations, especially for the multi-component hydrocarbon systems containing carbon dioxide at reservoir temperatures and pressures.
11.Upper bounds on maximum admissible noise in zeroth-order optimisation
Authors:Dmitry A. Pasechnyuk, Aleksandr Lobanov, Alexander Gasnikov
Abstract: In this paper, based on an information-theoretic upper bound on noise in convex Lipschitz continuous zeroth-order optimisation, we provide corresponding upper bounds for strongly convex and smooth classes of problems using non-constructive proofs through optimal reductions. Also, we show that, based on a one-dimensional grid-search optimisation algorithm, one can construct an algorithm for simplex-constrained optimisation with an upper bound on noise better than that for the ball-constrained and asymptotic-in-dimensionality case.
1.Automating Steady and Unsteady Adjoints: Efficiently Utilizing Implicit and Algorithmic Differentiation
Authors:Andrew Ning, Taylor McDonnell
Abstract: Algorithmic differentiation (AD) has become increasingly capable and straightforward to use. However, AD is inefficient when applied directly to solvers, a feature of most engineering analyses. We can leverage implicit differentiation to define a general AD rule, making adjoints automatic. Furthermore, we can leverage the structure of differential equations to automate unsteady adjoints in a memory efficient way. We also derive a technique to speed up explicit differential equation solvers, which have no iterative solver to exploit. All of these techniques are demonstrated on problems of various sizes, showing order of magnitude speed-ups with minimal code changes. Thus, we can enable users to easily compute accurate derivatives across complex analyses with internal solvers, or in other words, automate adjoints using a combination of AD and implicit differentiation.
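The core identity behind differentiating through a solver can be shown on a scalar toy problem. If the state $u(x)$ is defined implicitly by a residual $r(u, x) = 0$, then $du/dx = -(\partial r/\partial u)^{-1}\,\partial r/\partial x$, so the derivative needs only the converged state and never the solver's iterations. The residual $r(u, x) = u^3 + u - x$ and the function names below are assumptions made for this sketch; the paper's AD rules are more general.

```python
def solve_state(x, tol=1e-14, max_iter=100):
    """Newton solve of r(u, x) = u**3 + u - x = 0 for u (the 'internal solver')."""
    u = 0.0
    for _ in range(max_iter):
        r = u ** 3 + u - x
        if abs(r) < tol:
            break
        u -= r / (3.0 * u * u + 1.0)
    return u

def implicit_derivative(x):
    """du/dx from the implicit function theorem, using only the converged u."""
    u = solve_state(x)
    dr_du = 3.0 * u * u + 1.0   # partial of r with respect to the state
    dr_dx = -1.0                # partial of r with respect to the input
    return -dr_dx / dr_du

def finite_difference(x, h=1e-6):
    """Central difference through the solver, used here only as a check."""
    return (solve_state(x + h) - solve_state(x - h)) / (2.0 * h)

d_implicit = implicit_derivative(2.0)   # u(2) = 1, so du/dx = 1 / (3 + 1)
d_numeric = finite_difference(2.0)
```

Propagating derivatives through every Newton iteration would give the same number at a higher cost, which is the inefficiency the abstract refers to.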
2.Parameterized Complexity of Chordal Conversion for Sparse Semidefinite Programs with Small Treewidth
Authors:Richard Y. Zhang
Abstract: If a sparse semidefinite program (SDP), specified over $n\times n$ matrices and subject to $m$ linear constraints, has an aggregate sparsity graph $G$ with small treewidth, then chordal conversion will frequently allow an interior-point method to solve the SDP in just $O(m+n)$ time per-iteration. This is a significant reduction over the minimum $\Omega(n^{3})$ time per-iteration for a direct solution, but a definitive theoretical explanation was previously unknown. Contrary to popular belief, the speedup is not guaranteed by a small treewidth in $G$, as a diagonal SDP would have treewidth zero but can still necessitate up to $\Omega(n^{3})$ time per-iteration. Instead, we construct an extended aggregate sparsity graph $\overline{G}\supseteq G$ by forcing each constraint matrix $A_{i}$ to be its own clique in $G$. We prove that a small treewidth in $\overline{G}$ does indeed guarantee that chordal conversion will solve the SDP in $O(m+n)$ time per-iteration, to $\epsilon$-accuracy in at most $O(\sqrt{m+n}\log(1/\epsilon))$ iterations. For classical SDPs like the MAX-$k$-CUT relaxation and the Lovász Theta problem, the two sparsity graphs coincide $G=\overline{G}$, so our results provide a complete characterization for the complexity of chordal conversion, showing that a small treewidth is both necessary and sufficient for $O(m+n)$ time per-iteration. Real-world SDPs like the AC optimal power flow relaxation have different graphs $G\subseteq\overline{G}$ with similar small treewidths; while chordal conversion is already widely used on a heuristic basis, in this paper we provide the first rigorous guarantee that it solves such SDPs in $O(m+n)$ time per-iteration. [Supporting code at https://github.com/ryz-codes/chordalConv/]
3.Topology optimization of transient vibroacoustic problems for broadband filter design using cut elements
Authors:Cetin B. Dilgen, Niels Aage
Abstract: The focus of this article is on shape and topology optimization of transient vibroacoustic problems. The main contribution is a transient problem formulation that enables optimization over wide ranges of frequencies with complex signals, which are often of interest in industry. The work employs time domain methods to realize wide band optimization in the frequency domain. To this end, the objective function is defined in the frequency domain, where the frequency response of the system is obtained through a fast Fourier transform (FFT) algorithm on the transient response of the system. The work utilizes a parametric level set approach to implicitly define the geometry in which the zero level describes the interface between acoustic and structural domains. A cut element method is used to capture the geometry on a fixed background mesh through utilization of a special integration scheme that accurately resolves the interface. This allows for accurate solutions to strongly coupled vibroacoustic systems without having to re-mesh at each design update. The present work relies on efficient gradient based optimizers where the discrete adjoint method is used to calculate the sensitivities of objective and constraint functions. A thorough explanation of the consistent sensitivity calculation is given involving the FFT operation needed to define the objective function in the frequency domain. Finally, the developed framework is applied to various vibroacoustic filter designs and the optimization results are verified using commercial finite element software with a steady state time-harmonic formulation.
4.Convergence aspects for sets of measures with divergences and boundary conditions
Authors:Nicholas Chisholm, Carlos N. Rautenberg
Abstract: In this paper we study set convergence aspects for Banach spaces of vector-valued measures with divergences (represented by measures or by functions) and applications. We consider a form of normal trace characterization to establish subspaces of measures that directionally vanish in parts of the boundary, and present examples constructed with binary trees. Subsequently we study convex sets with total variation bounds and their convergence properties together with applications to the stability of optimization problems.
5.Quality Control in Particle Precipitation via Robust Optimization
Authors:Martina Kuchlbauer, Jana Dienstbier, Adeel Muneer, Hanna Hedges, Michael Stingl, Frauke Liers, Lukas Pflug
Abstract: In this work, we propose a robust optimization approach to mitigate the impact of uncertainties in particle precipitation. Our model incorporates partial differential equations, more specifically nonlinear and nonlocal population balance equations, to describe particle synthesis. The goal of the optimization problem is to design products with desired size distributions. Recognizing the impact of uncertainties, we extend the model to hedge against them. We emphasize the importance of robust protection to ensure the production of high-quality particles. To solve the resulting robust problem, we enhance a novel adaptive bundle framework for nonlinear robust optimization that integrates the exact method of moments approach for solving the population balance equations. Computational experiments performed with the integrated algorithm focus on uncertainties in the total mass of the system, as it greatly influences the quality of the resulting product. Using realistic parameter values for quantum dot synthesis, we demonstrate the efficiency of our integrated algorithm. Furthermore, we find that the unprotected process fails to achieve the desired particle characteristics, even for small uncertainties, which highlights the necessity of the robust process. The latter consistently outperforms the unprotected process in quality of the obtained product, in particular in perturbed scenarios.
6.Limited-Memory Greedy Quasi-Newton Method with Non-asymptotic Superlinear Convergence Rate
Authors:Zhan Gao, Aryan Mokhtari, Alec Koppel
Abstract: Non-asymptotic convergence analysis of quasi-Newton methods has gained attention with a landmark result establishing an explicit superlinear rate of $O((1/\sqrt{t})^t)$. The methods that obtain this rate, however, exhibit a well-known drawback: they require storing either the previous Hessian approximation matrix or all past curvature information to form the current Hessian inverse approximation. Limited-memory variants of quasi-Newton methods such as the celebrated L-BFGS alleviate this issue by leveraging a limited window of past curvature information to construct the Hessian inverse approximation. As a result, their per-iteration complexity and storage requirement are $O(\tau d)$, where $\tau \le d$ is the size of the window and $d$ is the problem dimension, reducing the $O(d^2)$ computational cost and memory requirement of standard quasi-Newton methods. However, to the best of our knowledge, there is no result showing a non-asymptotic superlinear convergence rate for any limited-memory quasi-Newton method. In this work, we close this gap by presenting a limited-memory greedy BFGS (LG-BFGS) method that achieves an explicit non-asymptotic superlinear rate. We incorporate displacement aggregation, i.e., decorrelating projection, in post-processing gradient variations, together with a basis vector selection scheme on variable variations, which greedily maximizes a progress measure of the Hessian estimate to the true Hessian. Their combination allows past curvature information to remain in a sparse subspace while yielding a valid representation of the full history. Interestingly, our established non-asymptotic superlinear convergence rate demonstrates a trade-off between the convergence speed and memory requirement, which, to our knowledge, is the first of its kind. Numerical results corroborate our theoretical findings and demonstrate the effectiveness of our method.
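For readers unfamiliar with how a limited window of curvature pairs yields an $O(\tau d)$ cost, the following is a sketch of the classical L-BFGS two-loop recursion. It illustrates standard L-BFGS only, not the LG-BFGS method of the paper (no displacement aggregation or greedy basis selection); the function names and the initial scaling choice are the usual textbook ones and are assumptions here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lbfgs_apply(grad, s_hist, y_hist):
    """Return H @ grad, with H the L-BFGS inverse-Hessian approximation.

    s_hist, y_hist: stored variable and gradient differences, oldest first.
    Cost is O(tau * d) for tau stored pairs of dimension d; no d-by-d
    matrix is ever formed. The search direction is the negative of the result.
    """
    q = list(grad)
    coeffs = []
    # First loop: newest pair to oldest.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        coeffs.append((rho, alpha))
    # Initial approximation gamma * I, scaled by the most recent pair.
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest.
    for (s, y), (rho, alpha) in zip(zip(s_hist, y_hist), reversed(coeffs)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r

s_hist = [[1.0, 0.0], [0.5, 1.0]]
y_hist = [[2.0, 0.5], [1.0, 3.0]]
# Secant condition: applying H to the newest y should return the newest s.
secant_check = lbfgs_apply(y_hist[-1], s_hist, y_hist)
```

The secant check at the end is a quick sanity test: any BFGS-type inverse update maps the most recent gradient difference to the most recent step.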
7.Demand-side management via optimal production scheduling in power-intensive industries: The case of metal casting process
Authors:Danial Ramin, Stefano Spinelli, Alessandro Brusaferri
Abstract: The increasing challenges to the grid stability posed by the penetration of renewable energy resources urge a more active role for demand response programs as viable alternatives to a further expansion of peak power generators. This work presents a methodology to exploit the demand flexibility of energy-intensive industries under Demand-Side Management programs in the energy and reserve markets. To this end, we propose a novel scheduling model for a multi-stage multi-line process, which incorporates both the critical manufacturing constraints and the technical requirements imposed by the market. Using a mixed-integer programming approach, two optimization problems are formulated to sequentially minimize the cost in a day-ahead energy market and maximize the reserve provision when participating in the ancillary market. The effectiveness of the day-ahead scheduling model has been verified for the case of a real metal casting plant in the Nordic market, where a significant reduction of energy cost is obtained. Furthermore, the reserve provision is shown to be a potential tool for capitalizing on the reserve market as a secondary revenue stream.
1.Open-loop and closed-loop solvabilities for discrete-time mean-field stochastic linear quadratic optimal control problems
Authors:Teng Song, Bin Liu
Abstract: This paper discusses the discrete-time mean-field stochastic linear quadratic optimal control problems, whose weighting matrices in the cost functional are not assumed to be definite. The open-loop solvability is characterized by the existence of the solution to a mean-field forward-backward stochastic difference equation with a convexity condition and a stationary condition. The closed-loop solvability is presented by virtue of the existence of the regular solution to the generalized Riccati equations and the solution to the linear recursive equation, which is also shown by the uniform convexity of the cost functional. Moreover, based on a family of uniformly convex cost functionals, the finiteness of the problem is characterized. Also, it turns out that there exists a minimizing sequence whose convergence is equivalent to the open-loop solvability of the problem. Finally, some examples are given to illustrate the theory developed.
2.Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Authors:Kuangyu Ding, Jingyang Li, Kim-Chuan Toh
Abstract: The widely used stochastic gradient methods for minimizing nonconvex composite objective functions require the Lipschitz smoothness of the differentiable part. But the requirement does not hold true for problem classes including quadratic inverse problems and training neural networks. To address this issue, we investigate a family of stochastic Bregman proximal gradient (SBPG) methods, which only require smooth adaptivity of the differentiable part. SBPG replaces the upper quadratic approximation used in SGD with the Bregman proximity measure, resulting in a better approximation model that captures the non-Lipschitz gradients of the nonconvex objective. We formulate the vanilla SBPG and establish its convergence properties in the nonconvex setting without finite-sum structure. Experimental results on quadratic inverse problems attest to the robustness of SBPG. Moreover, we propose a momentum-based version of SBPG (MSBPG) and prove it has improved convergence properties. We apply MSBPG to the training of deep neural networks with a polynomial kernel function, which ensures the smooth adaptivity of the loss function. Experimental results on representative benchmarks demonstrate the effectiveness and robustness of MSBPG in training neural networks. Since the additional computation cost of MSBPG compared with SGD is negligible in large-scale optimization, MSBPG can potentially be employed as a universal open-source optimizer in the future.
3.The Implicit Rigid Tube Model Predictive Control
Authors:Saša V. Raković
Abstract: A computationally efficient reformulation of the rigid tube model predictive control is developed. A unique feature of the derived formulation is the utilization of the implicit set representations. This novel formulation does not require any set algebraic operations to be performed explicitly, and its implementation requires merely the use of the standard optimization solvers.
4.Optimal control of a parabolic equation with a nonlocal nonlinearity
Authors:Cyrille Kenne, Landry Djomegne, Gisèle Mophou
Abstract: This paper proposes an optimal control problem for a parabolic equation with a nonlocal nonlinearity. The system is described by a parabolic equation involving a nonlinear term that depends on the solution and its integral over the domain. We prove the existence and uniqueness of the solution to the system and the boundedness of the solution. Regularity results for the control-to-state operator, the cost functional and the adjoint state are also proved. We show the existence of optimal solutions and derive first-order necessary optimality conditions. In addition, second-order necessary and sufficient conditions for optimality are established.
5.Stability of optimal shapes and convergence of thresholding algorithms in linear and spectral optimal control problems
Authors:Antonin Chambolle, Idriss Mazari-Fouquer, Yannick Privat
Abstract: We prove the convergence of the fixed-point (also called thresholding) algorithm in three optimal control problems under large volume constraints. This algorithm was introduced by Céa, Gioan and Michel, and is of constant use in the simulation of $L^\infty-L^1$ optimal control problems. In this paper we consider the optimisation of the Dirichlet energy, of Dirichlet eigenvalues and of certain non-energetic problems. Our proofs rely on a new diagonalisation procedure for shape hessians in optimal control problems, which leads to local stability estimates.
6.Sum-of-squares relaxations for polynomial min-max problems over simple sets
Authors:Francis Bach (SIERRA)
Abstract: We consider min-max optimization problems for polynomial functions, where a multivariate polynomial is maximized with respect to a subset of variables, and the resulting maximal value is minimized with respect to the remaining variables. When the variables belong to simple sets (e.g., a hypercube, the Euclidean hypersphere, or a ball), we derive a sum-of-squares formulation based on a primal-dual approach. In the simplest setting, we provide a convergence proof when the degree of the relaxation tends to infinity and observe empirically that it can be finitely convergent in several situations. Moreover, our formulation leads to an interesting link with feasibility certificates for polynomial inequalities based on Putinar's Positivstellensatz.
7.Generalized Scaling for the Constrained Maximum-Entropy Sampling Problem
Authors:Zhongzhu Chen, Marcia Fampa, Jon Lee
Abstract: The best practical techniques for exact solution of instances of the constrained maximum-entropy sampling problem, a discrete-optimization problem arising in the design of experiments, are via a branch-and-bound framework, working with a variety of concave continuous relaxations of the objective function. A standard and computationally-important bound-enhancement technique in this context is (ordinary) scaling, via a single positive parameter. Scaling adjusts the shape of continuous relaxations to reduce the gaps between the upper bounds and the optimal value. We extend this technique to generalized scaling, employing a positive vector of parameters, which allows much more flexibility and thus significantly reduces the gaps further. We give mathematical results aimed at supporting algorithmic methods for computing optimal generalized scalings, and we give computational results demonstrating the performance of generalized scaling on benchmark problem instances.
8.Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression
Authors:Ryan Cory-Wright, Andrés Gómez
Abstract: Ridge regularized sparse regression involves selecting a subset of features that explains the relationship between a design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse regression by several orders of magnitude. Additionally, validation metrics are noisy estimators of the test-set error, with different hyperparameter combinations giving models with different amounts of noise. Therefore, optimizing over these metrics is vulnerable to out-of-sample disappointment, especially in underdetermined settings. To address this, we make two contributions. First, we leverage the generalization theory literature to propose confidence-adjusted variants of leave-one-out that display less propensity to out-of-sample disappointment. Second, we leverage ideas from the mixed-integer literature to obtain computationally tractable relaxations of confidence-adjusted leave-one-out, thereby minimizing it without solving as many MIOs. Our relaxations give rise to an efficient coordinate descent scheme which allows us to obtain significantly lower leave-one-out errors than via other methods in the literature. We validate our theory by demonstrating that we obtain significantly sparser and comparably accurate solutions compared with popular methods like GLMNet, and suffer from less out-of-sample disappointment. On synthetic datasets, our confidence adjustment procedure generates significantly fewer false discoveries, and improves out-of-sample performance by 2-5% compared to cross-validating without confidence adjustment. Across a suite of 13 real datasets, a calibrated version of our procedure improves the test set error by an average of 4% compared to cross-validating without confidence adjustment.
9.Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization
Authors:Lesi Chen, Yaohua Ma, Jingzhao Zhang
Abstract: Bilevel optimization has various applications such as hyper-parameter optimization and meta-learning. Designing theoretically efficient algorithms for bilevel optimization is more challenging than standard optimization because the lower-level problem defines the feasibility set implicitly via another optimization problem. One tractable case is when the lower-level problem permits strong convexity. Recent works show that second-order methods can provably converge to an $\epsilon$-first-order stationary point of the problem at a rate of $\tilde{\mathcal{O}}(\epsilon^{-2})$, yet these algorithms require a Hessian-vector product oracle. Kwon et al. (2023) resolved the problem by proposing a first-order method that can achieve the same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$. In this work, we provide an improved analysis demonstrating that the first-order method can also find an $\epsilon$-first-order stationary point within $\tilde {\mathcal{O}}(\epsilon^{-2})$ oracle complexity, which matches the upper bounds for second-order methods in the dependency on $\epsilon$. Our analysis further leads to simple first-order algorithms that can achieve similar near-optimal rates in finding second-order stationary points and in distributed bilevel problems.
1.An Approximate Projection onto the Tangent Cone to the Variety of Third-Order Tensors of Bounded Tensor-Train Rank
Authors:Charlotte Vermeylen, Guillaume Olikier, Marc Van Barel
Abstract: An approximate projection onto the tangent cone to the variety of third-order tensors of bounded tensor-train rank is proposed and proven to satisfy a better angle condition than the one proposed by Kutschan (2019). Such an approximate projection makes it possible, e.g., to compute gradient-related directions in the tangent cone, as required by algorithms aiming at minimizing a continuously differentiable function on the variety, a problem appearing notably in tensor completion. A numerical experiment is presented which indicates that, in practice, the angle condition satisfied by the proposed approximate projection is better than both the one satisfied by the approximate projection introduced by Kutschan and the proven theoretical bound.
2.Solving the Train Dispatching Problem in Large Networks by Column Generation
Authors:Maik Schälicke, Karl Nachtigall
Abstract: Disruptions in the operational flow of rail traffic can lead to conflicts between train movements, such that a scheduled timetable can no longer be realised. This is where dispatching is applied: existing conflicts are resolved and a dispatching timetable is provided. In the process, train paths are varied in their spatio-temporal course. This is called the train dispatching problem (TDP), which consists of selecting conflict-free train paths with minimum delay. Starting from a path-oriented formulation of the TDP, a binary linear decision model is introduced. For each possible train path, a binary decision variable indicates whether the train path is used by the request or not. Such a train path is constructed from a set of predefined path parts (speed profiles) within a time-space network. Instead of modelling pairwise conflicts, a stronger MIP formulation is achieved by a clique formulation. The combinatorics of speed profiles and different departure times results in a large number of possible train paths, so that the column generation method is used here. New train paths within the pricing problem can be calculated using shortest path techniques. Here, the shadow prices of conflict cliques must be taken into account. When constructing a new train path, it must be determined whether this train path belongs to a clique or not. This problem is tackled by a MIP. The methodology is tested on practical-size instances from a dispatching area in Germany. Numerical results show that the presented method achieves acceptable computation times with good solution quality while meeting the requirements for real-time dispatching.
3.Computational investigations of a two-class traffic flow model: mean-field and microscopic dynamics
Authors:Abderrahmane Habbal, Imad Kissami, Amal Machtalay, Ahmed Ratnani
Abstract: We address a multi-class traffic model, for which we computationally assess the ability of mean-field games (MFG) to yield approximate Nash equilibria for traffic flow games with an intractably large, finite number of players. We introduce a two-class traffic framework, following and extending the single-class lines of Huang et al. (2020). We extend the numerical methodologies, with recourse to techniques such as HPC and regularization of LGMRES solvers. The developed apparatus allows us to perform simulations at significantly larger space and time discretization scales. For three generic scenarios of cars and trucks, and three cost functionals, we provide numerous numerical results related to the autonomous vehicles (AVs) traffic dynamics, which corroborate for the multi-class case the effectiveness of the approach emphasized in Huang et al. (2020). We additionally provide several original comparisons of macroscopic Nash mean-field speeds with their microscopic versions, allowing us to computationally validate the so-called $\epsilon$-Nash approximation, with a rate slightly better than theoretically expected.
4.Synchronous dynamic game on system observability considering one or two steps optimality
Authors:Yueyue Xu, Xiaoming Hu, Lin Wang
Abstract: This paper studies a system security problem in the context of observability based on a two-party non-cooperative asynchronous dynamic game. A system is assumed to be secure if it is not observable. Both the defender and the attacker have means to modify the dimension of the unobservable subspace, which is set as the value function. Utilizing tools from geometric control, we construct the best response set under one-step or two-step optimality to minimize or maximize the value function. We find that the best response sets under one-step optimality are not single-valued maps, resulting in a variety of game outcomes. In the dynamic game considering two-step optimality, the definition and existence conditions of lock and oscillation game modes are given. Finally, the best response under two-step optimality and the Stackelberg game equilibrium are compared.
5.Fast Approximation of Unbalanced Optimal Transport and Maximum Mean Discrepancies
Authors:Rajmadan Lakshmanan, Alois Pichler
Abstract: This contribution presents significant computational accelerations to prominent schemes, which enable the comparison of measures, even with varying masses. Concisely, we employ nonequispaced fast Fourier transform to accelerate the radial kernel convolution in unbalanced optimal transport approximation, building on the Sinkhorn algorithm. Accelerated schemes are presented as well for the maximum mean discrepancies involving kernels based on distances. By employing nonequispaced fast Fourier transform, our approaches significantly reduce the arithmetic operations to compute the distances from $\mathcal O(n^2)$ to $\mathcal O(n\log n)$, which enables access to large and high-dimensional data sets. Furthermore, we show some robust relation between the Wasserstein distance and maximum mean discrepancies. Numerical experiments using synthetic data and real datasets demonstrate the computational acceleration and numerical precision.
6.Optimal Sensor Placement with Adaptive Constraints for Nuclear Digital Twins
Authors:Niharika Karnik, Mohammad G. Abdo, Carlos E. Estrada Perez, Jun Soo Yoo, Joshua J. Cogliati, Richard S. Skifton, Pattrick Calderoni, Steven L. Brunton, Krithika Manohar
Abstract: Given harsh operating conditions and physical constraints in reactors, nuclear applications cannot afford to equip the physical asset with a large array of sensors. Therefore, it is crucial to carefully determine the placement of sensors within the given spatial limitations, enabling the reconstruction of reactor flow fields and the creation of nuclear digital twins. Various design considerations are imposed, such as predetermined sensor locations, restricted areas within the reactor, a fixed number of sensors allocated to a specific region, or sensors positioned at a designated distance from one another. We develop a data-driven technique that integrates constraints into an optimization procedure for sensor placement, aiming to minimize reconstruction errors. Our approach employs a greedy algorithm that can optimize sensor locations on a grid, adhering to user-defined constraints. We demonstrate the near optimality of our algorithm by computing all possible configurations for selecting a certain number of sensors for a randomly generated state space system. In this work, the algorithm is demonstrated on the Out-of-Pile Testing and Instrumentation Transient Water Irradiation System (OPTI-TWIST) prototype vessel, which is electrically heated to mimic the neutronics effect of the Transient Reactor Test facility (TREAT) at Idaho National Laboratory (INL). The resulting sensor-based reconstruction of temperature within the OPTI-TWIST minimizes error, provides probabilistic bounds for noise-induced uncertainty and will finally be used for communication between the digital twin and experimental facility.
1.Rotation Group Synchronization via Quotient Manifold
Authors:Linglingzhi Zhu, Chong Li, Anthony Man-Cho So
Abstract: Rotation group $\mathcal{SO}(d)$ synchronization is an important inverse problem and has attracted intense attention from numerous application fields such as graph realization, computer vision, and robotics. In this paper, we focus on the least-squares estimator of rotation group synchronization with general additive noise models, which is a nonconvex optimization problem with manifold constraints. Unlike the phase/orthogonal group synchronization, there are limited provable approaches for solving rotation group synchronization. First, we derive improved estimation results of the least-squares/spectral estimator, illustrating the tightness and validating the existing relaxation methods of solving rotation group synchronization through the optimum of relaxed orthogonal group version under near-optimal noise level for exact recovery. Moreover, departing from the standard approach of utilizing the geometry of the ambient Euclidean space, we adopt an intrinsic Riemannian approach to study orthogonal/rotation group synchronization. Benefiting from a quotient geometric view, we prove the positive definite condition of quotient Riemannian Hessian around the optimum of orthogonal group synchronization problem, and consequently the Riemannian local error bound property is established to analyze the convergence rate properties of various Riemannian algorithms. As a simple and feasible method, the sequential convergence guarantee of the (quotient) Riemannian gradient method for solving orthogonal/rotation group synchronization problem is studied, and we derive its global linear convergence rate to the optimum with the spectral initialization. All results are deterministic without any probabilistic model.
2.Data-driven Approximation of Distributionally Robust Chance Constraints using Bayesian Credible Intervals
Authors:Zhiping Chen, Wentao Ma, Bingbing Ji
Abstract: The non-convexity and intractability of distributionally robust chance constraints make them challenging to cope with. From a data-driven perspective, we propose formulating it as a robust optimization problem to ensure that the distributionally robust chance constraint is satisfied with high probability. To incorporate available data and prior distribution knowledge, we construct ambiguity sets for the distributionally robust chance constraint using Bayesian credible intervals. We establish the congruent relationship between the ambiguity set in Bayesian distributionally robust chance constraints and the uncertainty set in a specific robust optimization. In contrast to most existent uncertainty set construction methods which are only applicable for particular settings, our approach provides a unified framework for constructing uncertainty sets under different marginal distribution assumptions, thus making it more flexible and widely applicable. Additionally, under the concavity assumption, our method provides strong finite sample probability guarantees for optimal solutions. The practicality and effectiveness of our approach are illustrated with numerical experiments on portfolio management and queuing system problems. Overall, our approach offers a promising solution to distributionally robust chance constrained problems and has potential applications in other fields.
3.Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models
Authors:Leonardo Galli, Holger Rauhut, Mark Schmidt
Abstract: Recent works have shown that line search methods can speed up Stochastic Gradient Descent (SGD) and Adam in modern over-parameterized settings. However, existing line searches may take steps that are smaller than necessary since they require a monotone decrease of the (mini-)batch objective function. We explore nonmonotone line search methods to relax this condition and possibly accept larger step sizes. Despite the lack of a monotonic decrease, we prove the same fast rates of convergence as in the monotone case. Our experiments show that nonmonotone methods improve the speed of convergence and generalization properties of SGD/Adam even beyond the previous monotone line searches. We propose a POlyak NOnmonotone Stochastic (PoNoS) method, obtained by combining a nonmonotone line search with a Polyak initial step size. Furthermore, we develop a new resetting technique that in the majority of the iterations reduces the amount of backtracks to zero while still maintaining a large initial step size. To the best of our knowledge, a first runtime comparison shows that the epoch-wise advantage of line-search-based methods gets reflected in the overall computational time.
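The nonmonotone acceptance test described in this abstract can be illustrated with a minimal deterministic sketch. It is not the authors' PoNoS method: it uses full gradients instead of mini-batches, a max-over-window reference value as one common nonmonotone rule, and a Polyak-type initial step that assumes the optimal value `f_star` is known (0 here). The function names and the constants `window`, `armijo_c`, `polyak_c`, and `shrink` are illustrative assumptions.

```python
def nonmonotone_gd(f, grad, x, iters=60, window=5,
                   armijo_c=0.1, polyak_c=0.5, shrink=0.5, f_star=0.0):
    """Gradient descent with a nonmonotone Armijo backtracking line search.

    A step is accepted when the trial value is sufficiently below the
    maximum of the last `window` objective values, not the current one.
    """
    history = [f(x)]
    for _ in range(iters):
        g = grad(x)
        gg = sum(gi * gi for gi in g)
        if gg == 0.0:
            break
        # Polyak-type initial step (assumes f_star is known).
        step = (history[-1] - f_star) / (polyak_c * gg)
        reference = max(history[-window:])
        while True:
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            if f(trial) <= reference - armijo_c * step * gg:
                break
            step *= shrink  # backtrack
        x = trial
        history.append(f(x))
    return x, history


def f_quad(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad_quad(x):
    return [x[0], 10.0 * x[1]]


x_final, values = nonmonotone_gd(f_quad, grad_quad, [1.0, 1.0])
# True if at least one accepted step increased the objective.
had_increase = any(b > a for a, b in zip(values, values[1:]))
```

On this toy quadratic the large Polyak-type steps are accepted even when the objective temporarily increases, which a monotone Armijo test would reject.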
4.A Gradient Descent-Ascent Method for Continuous-Time Risk-Averse Optimal Control
Authors:Gabriel Velho, Jean Auriol, Riccardo Bonalli
Abstract: In this paper, we consider continuous-time stochastic optimal control problems where the cost is evaluated through a coherent risk measure. We provide an explicit gradient descent-ascent algorithm which applies to problems subject to non-linear stochastic differential equations. More specifically, we leverage duality properties of coherent risk measures to relax the problem via a smooth min-max reformulation which induces artificial strong concavity in the max subproblem. We then formulate necessary conditions of optimality for this relaxed problem which we leverage to prove convergence of the gradient descent-ascent algorithm to candidate solutions of the original problem. Finally, we showcase the efficiency of our algorithm through numerical simulations involving trajectory tracking problems and highlight the benefit of favoring risk measures over classical expectation.
5.The chain control set of a linear control system
Authors:Adriano Da Silva
Abstract: In this paper, we analyze the chain control sets of linear control systems on connected Lie groups. Our main result shows that the compactness of the central subgroup associated with the drift is a necessary and sufficient condition to assure the uniqueness and compactness of the chain control set.
1.Distributed Random Reshuffling Methods with Improved Convergence
Authors:Kun Huang, Linli Zhou, Shi Pu
Abstract: This paper proposes two distributed random reshuffling methods, namely Gradient Tracking with Random Reshuffling (GT-RR) and Exact Diffusion with Random Reshuffling (ED-RR), to solve the distributed optimization problem over a connected network, where a set of agents aim to minimize the average of their local cost functions. Both algorithms invoke random reshuffling (RR) update for each agent, inherit favorable characteristics of RR for minimizing smooth nonconvex objective functions, and improve the performance of previous distributed random reshuffling methods both theoretically and empirically. Specifically, both GT-RR and ED-RR achieve the convergence rate of $O(1/[(1-\lambda)^{1/3}m^{1/3}T^{2/3}])$ in driving the (minimum) expected squared norm of the gradient to zero, where $T$ denotes the number of epochs, $m$ is the sample size for each agent, and $1-\lambda$ represents the spectral gap of the mixing matrix. When the objective functions further satisfy the Polyak-Łojasiewicz (PL) condition, we show GT-RR and ED-RR both achieve $O(1/[(1-\lambda)mT^2])$ convergence rate in terms of the averaged expected differences between the agents' function values and the global minimum value. Notably, both results are comparable to the convergence rates of centralized RR methods (up to constant factors depending on the network topology) and outperform those of previous distributed random reshuffling algorithms. Moreover, we support the theoretical findings with a set of numerical experiments.
2.A Novel Sensor Design for a Cantilevered Mead-Marcus-type Sandwich Beam Model by the Order-reduction Technique
Authors:Ahmet Ozkan Ozer, Ahmet Kaan Aydin
Abstract: A novel space-discretized Finite Differences-based model reduction, introduced in (Liu, Guo, 2020), is extended to the partial differential equations (PDE) model of a multi-layer Mead-Marcus-type sandwich beam with clamped-free boundary conditions. The PDE model describes transverse vibrations for a sandwich beam whose alternating outer elastic layers constrain viscoelastic core layers, which allow transverse shear. The major goal of this project is to design a single tip velocity sensor to control the overall dynamics on the beam. Since the spectrum of the PDE cannot be constructed analytically, the so-called multipliers approach is adopted to prove that the PDE model is exactly observable with sub-optimal observation time. Next, the PDE model is reduced by the "order-reduced" Finite-Differences technique. This method does not require any type of filtering, though the exact observability as $h\to 0$ is achieved by a constraint on the material constants. The main challenge here is the strong coupling of the shear dynamics of the middle layer with overall bending dynamics. This complicates the absorption of coupling terms in the discrete energy estimates. This is sharply different from a single-layer (Euler-Bernoulli) beam.
3.Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed Smoothness Conditions
Authors:Xuxing Chen, Tesi Xiao, Krishnakumar Balasubramanian
Abstract: Stochastic Bilevel optimization usually involves minimizing an upper-level (UL) function that is dependent on the arg-min of a strongly-convex lower-level (LL) function. Several algorithms utilize Neumann series to approximate certain matrix inverses involved in estimating the implicit gradient of the UL function (hypergradient). The state-of-the-art StOchastic Bilevel Algorithm (SOBA) [16] instead uses stochastic gradient descent steps to solve the linear system associated with the explicit matrix inversion. This modification enables SOBA to match the lower bound of sample complexity for the single-level counterpart in non-convex settings. Unfortunately, the current analysis of SOBA relies on the assumption of higher-order smoothness for the UL and LL functions to achieve optimality. In this paper, we introduce a novel fully single-loop and Hessian-inversion-free algorithmic framework for stochastic bilevel optimization and present a tighter analysis under standard smoothness assumptions (first-order Lipschitzness of the UL function and second-order Lipschitzness of the LL function). Furthermore, we show that by a slight modification of our approach, our algorithm can handle a more general multi-objective robust bilevel optimization problem. For this case, we obtain the state-of-the-art oracle complexity results demonstrating the generality of both the proposed algorithmic and analytic frameworks. Numerical experiments demonstrate the performance gain of the proposed algorithms over existing ones.
4.Comparing the Methods of Alternating and Simultaneous Projections for Two Subspaces
Authors:Simeon Reich, Rafał Zalas
Abstract: We study the well-known methods of alternating and simultaneous projections when applied to two nonorthogonal linear subspaces of a real Euclidean space. Assuming that both of the methods have a common starting point chosen from either one of the subspaces, we show that the method of alternating projections converges significantly faster than the method of simultaneous projections. On the other hand, we provide examples of subspaces and starting points, where the method of simultaneous projections outperforms the method of alternating projections.
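The two iterations compared in this abstract can be sketched on a toy case: two lines through the origin in the plane, with a common starting point on the first line. The 30-degree angle, the iteration count, and all function names below are illustrative assumptions and are not taken from the paper.

```python
import math


def project_onto_line(x, u):
    """Orthogonal projection of a 2-D point x onto span{u}, u a unit vector."""
    t = x[0] * u[0] + x[1] * u[1]
    return (t * u[0], t * u[1])


def alternating_projections(x, u, v, iters):
    """Project onto span{u}, then onto span{v}, repeatedly."""
    for _ in range(iters):
        x = project_onto_line(project_onto_line(x, u), v)
    return x


def simultaneous_projections(x, u, v, iters):
    """Average the two projections at every iteration."""
    for _ in range(iters):
        a = project_onto_line(x, u)
        b = project_onto_line(x, v)
        x = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
    return x


theta = math.pi / 6                      # angle between the two lines
u = (1.0, 0.0)
v = (math.cos(theta), math.sin(theta))
x0 = (1.0, 0.0)                          # starting point on the first line

# The subspaces intersect only at the origin, so the norm of the iterate
# is the distance to the limit point.
err_alt = math.hypot(*alternating_projections(x0, u, v, 20))
err_sim = math.hypot(*simultaneous_projections(x0, u, v, 20))
```

In this configuration the alternating iterate contracts by $\cos^2\theta$ per sweep, while the averaged operator contracts by at most $(1+\cos\theta)/2$, so `err_alt` ends up well below `err_sim`.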
5.Stability Analysis of Trajectories on Manifolds with Applications to Observer and Controller Design
Authors:Dongjun Wu, Bowen Yi, Anders Rantzer
Abstract: This paper examines the local exponential stability (LES) of trajectories for nonlinear systems on Riemannian manifolds. We present necessary and sufficient conditions for LES of a trajectory on a Riemannian manifold by analyzing the complete lift of the system along the given trajectory. These conditions are coordinate-free which reveal fundamental relationships between exponential stability and incremental stability in a local sense. We then apply these results to design tracking controllers and observers for Euler-Lagrangian systems on manifolds; a notable advantage of our design is that it visibly reveals the effect of curvature on system dynamics and hence suggests compensation terms in the controller and observer. Additionally, we revisit some well-known intrinsic observer problems using our proposed method, which largely simplifies the analysis compared to existing results.
1.A Lagrangian-Based Method with "False Penalty" for Linearly Constrained Nonconvex Composite Optimization
Authors:Jong Gwang Kim
Abstract: We introduce a primal-dual framework for solving linearly constrained nonconvex composite optimization problems. Our approach is based on a newly developed Lagrangian, which incorporates false penalty and dual smoothing terms. This new Lagrangian enables us to develop a simple first-order algorithm that converges to a stationary solution under standard assumptions. We further establish global convergence, provided that the objective function satisfies the Kurdyka-Łojasiewicz property. Our method provides several advantages: it simplifies the treatment of constraints by effectively bounding the multipliers without boundedness assumptions on the dual iterates; it guarantees global convergence without requiring the surjectivity assumption on the linear operator; and it is a single-loop algorithm that does not involve solving penalty subproblems, achieving an iteration complexity of $\mathcal{O}(1/\epsilon^2)$ to find an $\epsilon$-stationary solution. Preliminary experiments on test problems demonstrate the practical efficiency and robustness of our method.
2.A gradient projection method for semi-supervised hypergraph clustering problems
Authors:Jingya Chang, Dongdong Liu, Min Xi
Abstract: Semi-supervised clustering problems focus on clustering data with labels. In this paper, we consider the semi-supervised hypergraph problems. We use the hypergraph-related tensor to construct an orthogonality-constrained optimization model. The optimization problem is solved by a retraction method, which employs the polar decomposition to map the gradient direction in the tangent space to the Stiefel manifold. A nonmonotone curvilinear search is implemented to guarantee reduction in the objective function value. Convergence analysis demonstrates that the first-order optimality condition is satisfied at the accumulation point. Experiments on synthetic hypergraphs and hypergraphs given by real data demonstrate the effectiveness of our method.
3.A Passivity-Based Method for Accelerated Convex Optimisation
Authors:Namhoon Cho, Hyo-Sang Shin
Abstract: This study presents a constructive methodology for designing accelerated convex optimisation algorithms in continuous-time domain. The two key enablers are the classical concept of passivity in control theory and the time-dependent change of variables that maps the output of the internal dynamic system to the optimisation variables. The Lyapunov function associated with the optimisation dynamics is obtained as a natural consequence of specifying the internal dynamics that drives the state evolution as a passive linear time-invariant system. The passivity-based methodology provides a general framework that has the flexibility to generate convex optimisation algorithms with the guarantee of different convergence rate bounds on the objective function value. The same principle applies to the design of online parameter update algorithms for adaptive control by re-defining the output of internal dynamics to allow for the feedback interconnection with tracking error dynamics.
4.Stabilization and Spill-Free Transfer of Viscous Liquid in a Tank
Authors:Iasson Karafyllis, Miroslav Krstic
Abstract: Flow control occupies a special place in the fields of partial differential equations (PDEs) and control theory, where the complex behavior of solutions of nonlinear dynamics in very high dimension is not just to be understood but also to be assigned specific desired properties, by feedback control. Among several benchmark problems in flow control, the liquid-tank problem is particularly attractive as a research topic. In the liquid-tank problem the objective is to move a tank filled with liquid, suppress the nonlinear oscillations of the liquid in the process, bring the tank and liquid to rest, and avoid liquid spillage in the process. In other words, this is a problem of nonlinear PDE stabilization subject to state constraints. This review article focuses only on recent results on liquid-tank stabilization for viscous liquids. All possible cases are studied: with and without friction from the tank walls, with and without surface tension. Moreover, novel results are provided for the linearization of the tank-liquid system. The linearization of the tank-liquid system gives a high-order PDE which is a combination of a wave equation with Kelvin-Voigt damping and an Euler-Bernoulli beam equation. The feedback design methodology presented in the article is based on Control Lyapunov Functionals (CLFs), suitably extended from the CLF methodology for ODEs to the infinite-dimensional case. The CLFs proposed are modifications and augmentations of the total energy functionals for the tank-liquid system, so that the dissipative effects of viscosity, friction, and surface tension are captured and additional dissipation by feedback is made relatively easy. The article closes with an extensive list of open problems.
5.Graph-Based Conditions for Feedback Stabilization of Switched and LPV Systems
Authors:Matteo Della Rossa, Thiago Alves Lima, Marc Jungers, Raphaël M. Jungers
Abstract: This paper presents novel stabilizability conditions for switched linear systems with arbitrary and uncontrollable underlying switching signals. We distinguish and study two particular settings: i) the robust case, in which the active mode is completely unknown and unobservable, and ii) the mode-dependent case, in which the controller depends on the current active switching mode. The technical developments are based on graph-theory tools, relying in particular on the path-complete Lyapunov functions framework. The main idea is to use directed and labeled graphs to encode Lyapunov inequalities to design robust and mode-dependent piecewise linear state-feedback controllers. This results in novel and flexible conditions, with the particular feature of being in the form of linear matrix inequalities (LMIs). Our technique thus provides a first controller-design strategy allowing piecewise linear feedback maps and piecewise quadratic (control) Lyapunov functions by means of semidefinite programming. Numerical examples illustrate the application of the proposed techniques, the relations between the graph order, the robustness, and the performance of the closed loop.
6.Regularized Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity
Authors:Runyu Zhang, Yang Hu, Na Li
Abstract: This paper focuses on reinforcement learning for the regularized robust Markov decision process (MDP) problem, an extension of the robust MDP framework. We first introduce the risk-sensitive MDP and establish the equivalence between risk-sensitive MDP and regularized robust MDP. This equivalence offers an alternative perspective for addressing the regularized RMDP and enables the design of efficient learning algorithms. Given this equivalence, we further derive the policy gradient theorem for the regularized robust MDP problem and prove the global convergence of the exact policy gradient method under the tabular setting with direct parameterization. We also propose a sample-based offline learning algorithm, namely the robust fitted-Z iteration (RFZI), for a specific regularized robust MDP problem with a KL-divergence regularization term and analyze the sample complexity of the algorithm. Our results are also supported by numerical simulations.
7.Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
Authors:Dongsheng Ding, Chen-Yu Wei, Kaiqing Zhang, Alejandro Ribeiro
Abstract: We study the problem of computing an optimal policy of an infinite-horizon discounted constrained Markov decision process (constrained MDP). Despite the popularity of Lagrangian-based policy search methods used in practice, the oscillation of policy iterates in these methods has not been fully understood, bringing out issues such as violation of constraints and sensitivity to hyper-parameters. To fill this gap, we employ the Lagrangian method to cast a constrained MDP into a constrained saddle-point problem in which max/min players correspond to primal/dual variables, respectively, and develop two single-time-scale policy-based primal-dual algorithms with non-asymptotic convergence of their policy iterates to an optimal constrained policy. Specifically, we first propose a regularized policy gradient primal-dual (RPG-PD) method that updates the policy using an entropy-regularized policy gradient, and the dual via a quadratic-regularized gradient ascent, simultaneously. We prove that the policy primal-dual iterates of RPG-PD converge to a regularized saddle point with a sublinear rate, while the policy iterates converge sublinearly to an optimal constrained policy. We further instantiate RPG-PD in large state or action spaces by including function approximation in policy parametrization, and establish similar sublinear last-iterate policy convergence. Second, we propose an optimistic policy gradient primal-dual (OPG-PD) method that employs the optimistic gradient method to update primal/dual variables, simultaneously. We prove that the policy primal-dual iterates of OPG-PD converge to a saddle point that contains an optimal constrained policy, with a linear rate. To the best of our knowledge, this work appears to be the first non-asymptotic policy last-iterate convergence result for single-time-scale algorithms in constrained MDPs.
8.Closed-form expressions for the pure time delay in terms of the input and output Laguerre spectra
Authors:Alexander Medvedev
Abstract: The pure time delay operator is considered in continuous and discrete time under the assumption that the input signal is square-integrable (square-summable). By making use of a discrete convolution operator with polynomial Markov parameters, a common framework for handling the continuous and discrete case is set. Closed-form expressions for the delay value are derived in terms of the Laguerre spectra of the output and input signals. The expressions hold for any feasible value of the Laguerre parameter and can be utilized, e.g., for building time-delay estimators that allow for non-persistent input. A simulation example is provided to illustrate the principle of Laguerre-domain time delay modeling and analysis.
9.Projection-Free Methods for Solving Nonconvex-Concave Saddle Point Problems
Authors:Morteza Boroun, Erfan Yazdandoost Hamedani, Afrooz Jalilzadeh
Abstract: In this paper, we investigate a class of constrained saddle point (SP) problems where the objective function is nonconvex-concave and smooth. This class of problems has wide applicability in machine learning, including robust multi-class classification and dictionary learning. Several projection-based primal-dual methods have been developed for tackling this problem; however, the availability of methods with projection-free oracles remains limited. To address this gap, we propose efficient single-loop projection-free methods reliant on first-order information. In particular, using regularization and nested approximation techniques, we propose a primal-dual conditional gradient method that solely employs linear minimization oracles to handle constraints. Assuming that the constraint set in the maximization is strongly convex, our method achieves an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-6})$ iterations. When the projection onto the constraint set of maximization is easy to compute, we propose a one-sided projection-free method that achieves an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-4})$ iterations. Moreover, we present improved iteration complexities of our methods under a strong concavity assumption. To the best of our knowledge, our proposed algorithms are among the first projection-free methods with convergence guarantees for solving nonconvex-concave SP problems.
1.Randomized Robust Price Optimization
Authors:Xinyi Guan, Velibor V. Mišić
Abstract: The robust multi-product pricing problem is to determine the prices of a collection of products so as to maximize the worst-case revenue, where the worst case is taken over an uncertainty set of demand models that the firm expects could be realized in practice. A tacit assumption in this approach is that the pricing decision is a deterministic decision: the prices of the products are fixed and do not vary. In this paper, we consider a randomized approach to robust pricing, where a decision maker specifies a distribution over potential price vectors so as to maximize its worst-case revenue over an uncertainty set of demand models. We formally define this problem -- the randomized robust price optimization problem -- and analyze when a randomized price scheme performs as well as a deterministic price vector, and identify cases in which it can yield a benefit. We also propose two solution methods for obtaining an optimal randomization scheme over a discrete set of candidate price vectors based on constraint generation and double column generation, respectively, and show how these methods are applicable for common demand models, such as the linear, semi-log and log-log demand models. We numerically compare the randomized approach against the deterministic approach on a variety of synthetic and real problem instances; on synthetic instances, we show that the improvement in worst-case revenue can be as much as 1300%, while on real data instances derived from a grocery retail scanner dataset, the improvement can be as high as 92%.
2.Linear convergence of Nesterov-1983 with the strong convexity
Authors:Bowen Li, Bin Shi, Ya-xiang Yuan
Abstract: For modern gradient-based optimization, a developmental landmark is Nesterov's accelerated gradient descent method, proposed in [Nesterov, 1983] and abbreviated here as Nesterov-1983. Afterward, one important advance was its proximal generalization, named the fast iterative shrinkage-thresholding algorithm (FISTA), which is widely used in image science and engineering. However, it is unknown whether Nesterov-1983 and FISTA converge linearly on strongly convex functions, which has been listed as an open problem in the comprehensive review [Chambolle and Pock, 2016, Appendix B]. In this paper, we answer this question by the use of the high-resolution differential equation framework. Along with the phase-space representation previously adopted, the key difference here in constructing the Lyapunov function is that the coefficient of the kinetic energy varies with the iteration. Furthermore, we point out that the linear convergence of both algorithms on strongly convex functions has no dependence on the parameter $r$. Meanwhile, it is also obtained that the proximal subgradient norm converges linearly.
3.On the finitary content of Dykstra's cyclic projections algorithm
Authors:Pedro Pinto
Abstract: We study the asymptotic behaviour of the well-known Dykstra's algorithm through the lens of proof-theoretical techniques. We provide an elementary proof for the convergence of Dykstra's algorithm in which the standard argument is stripped to its central features and where the original compactness principles are circumvented, additionally providing highly uniform primitive recursive rates of metastability in a fully general setting. Moreover, under an additional assumption, we are even able to obtain effective general rates of convergence. We argue that such an additional condition is actually necessary for the existence of general uniform rates of convergence.
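As background, Dykstra's cyclic projections scheme for two convex sets can be sketched as below. The two half-planes, the starting point, and the function names are illustrative assumptions and do not come from the paper. The sketch also runs plain alternating projections, which land in the intersection but not at the nearest point of it.

```python
def project_A(v):
    # A = {(x1, x2) : x2 <= 0}
    return (v[0], min(v[1], 0.0))


def project_B(v):
    # B = {(x1, x2) : x1 + x2 <= 0}
    excess = max(v[0] + v[1], 0.0) / 2.0
    return (v[0] - excess, v[1] - excess)


def dykstra(z, iters=50):
    """Dykstra's algorithm: projections with one correction term per set."""
    x = tuple(z)
    p = (0.0, 0.0)  # correction attached to A
    q = (0.0, 0.0)  # correction attached to B
    for _ in range(iters):
        a_in = (x[0] + p[0], x[1] + p[1])
        y = project_A(a_in)
        p = (a_in[0] - y[0], a_in[1] - y[1])
        b_in = (y[0] + q[0], y[1] + q[1])
        x = project_B(b_in)
        q = (b_in[0] - x[0], b_in[1] - x[1])
    return x


def alternating(z, iters=50):
    """Plain alternating projections, shown for comparison only."""
    x = tuple(z)
    for _ in range(iters):
        x = project_B(project_A(x))
    return x


nearest = dykstra((2.0, 1.0))       # (0.5, -0.5), the projection onto A ∩ B
feasible = alternating((2.0, 1.0))  # (1.0, -1.0), feasible but not nearest
```

The correction terms `p` and `q` are what distinguish Dykstra's scheme from alternating projections: they re-add the part removed by the previous projection onto the same set.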
4.Barzilai-Borwein Proximal Gradient Methods for Multiobjective Composite Optimization Problems with Improved Linear Convergence
Authors:Jian Chen, Liping Tang, Xinmin Yang
Abstract: Over the past two decades, multiobjective gradient descent methods have received increasing attention due to the seminal work of Fliege and Svaiter. Recently, Chen et al. pointed out that imbalances among objective functions can lead to a small stepsize in Fliege and Svaiter's method, which significantly decelerates the convergence. To address the issue, Chen et al. proposed the Barzilai-Borwein descent method for multiobjective optimization (BBDMO). Their work demonstrated that BBDMO achieves better stepsize and performance compared to Fliege and Svaiter's method. However, a theoretical explanation for the superiority of BBDMO over the previous method has remained open. In this paper, we extend Chen et al.'s method to composite cases and propose two types of Barzilai-Borwein proximal gradient methods (BBPGMO). Moreover, we prove that the convergence rates of BBPGMO are $O(\frac{1}{\sqrt{k}})$, $O(\frac{1}{k})$, and $O(r^{k})$ $(0<r<1)$ for non-convex, convex, and strongly convex problems, respectively. Notably, the linear rate $r$ in our proposed method is smaller than the previous rates of first-order methods for multiobjective optimization, which directly indicates its improved performance. We further validate these theoretical results through numerical experiments.
5.Version 2.0 -- cashocs: A Computational, Adjoint-Based Shape Optimization and Optimal Control Software
Authors:Sebastian Blauth
Abstract: In this paper, we present version 2.0 of cashocs. Our software automates the solution of PDE constrained optimization problems for design optimization and optimal control. Since its inception, many new features and useful tools have been added to cashocs, making it even more flexible and efficient. The most significant additions are a framework for space mapping, the ability to solve topology optimization problems with a level-set approach, the support for parallelism via MPI, and the ability to handle additional (state) constraints. In this software update, we describe the key additions to cashocs, which is now even better-suited for solving complex PDE constrained optimization problems.
6.Distributionally Robust Airport Ground Holding Problem under Wasserstein Ambiguity Sets
Authors:Haochen Wu, Max Z. Li
Abstract: The airport ground holding problem seeks to minimize flight delay costs due to reductions in the capacity of airports. However, the critical input of future airport capacities is often difficult to predict, presenting a challenging yet realistic setting. Even when capacity predictions provide a distribution of possible capacity scenarios, such distributions may themselves be uncertain (e.g., distribution shifts). To address the problem of designing airport ground holding policies under distributional uncertainty, we formulate and solve the airport ground holding problem using distributionally robust optimization (DRO). We address the uncertainty in the airport capacity distribution by defining ambiguity sets based on the Wasserstein distance metric. We propose reformulations which integrate the ambiguity sets into the airport ground holding problem structure, and discuss discretization properties of the proposed model. We discuss comparisons (via numerical experiments) between ground holding policies and optimized costs derived through the deterministic, stochastic, and distributionally robust airport ground holding problems. Our experiments show that the DRO model outperforms the stochastic models when there is a significant difference between the empirical airport capacity distribution and the realized airport capacity distribution. We note that DRO can be a valuable tool for decision-makers seeking to design airport ground holding policies, particularly when the available data regarding future airport capacities are highly uncertain.
7.On integrality in semidefinite programming for discrete optimization
Authors:Frank de Meijer, Renata Sotirov
Abstract: It is well-known that by adding integrality constraints to the semidefinite programming (SDP) relaxation of the max-cut problem, the resulting integer semidefinite program is an exact formulation of the problem. In this paper we show similar results for a wide variety of discrete optimization problems for which SDP relaxations have been derived. Based on a comprehensive study on discrete positive semidefinite matrices, we follow a generic approach to derive mixed-integer semidefinite programming (MISDP) formulations of binary quadratically constrained quadratic programs and binary quadratic matrix programs. Applying a problem-specific approach, we derive more compact MISDP formulations of several problems, such as the quadratic assignment problem, the graph partition problem and the integer matrix completion problem. We also show that several structured problems allow for novel compact MISDP formulations through the notion of association schemes. Complementary to the recent advances on algorithmic aspects related to MISDP, this work opens new perspectives on solution approaches for the problems considered here.
8.A Distributed Optimization Framework to Regulate the Electricity Consumption of a Residential Neighborhood
Authors:Erhan Can Ozcan, Ioannis Ch. Paschalidis
Abstract: Increased variability of electricity generation due to renewable sources requires either large amounts of stand-by production capacity or some form of demand response. For residential loads, space heating and cooling, water heating, electric vehicle charging, and routine appliances make up the bulk of the electricity consumption. Controlling these loads can reduce the peak load of a residential neighborhood and facilitate matching supply with demand. However, maintaining user comfort is important for ensuring user participation in such a program. This paper formulates a novel mixed integer linear programming problem to control the overall electricity consumption of a residential neighborhood by considering the users' comfort. To efficiently solve the problem for communities involving a large number of homes, a distributed optimization framework based on the Dantzig-Wolfe decomposition technique is developed. We demonstrate the load shaping capacity and the computational performance of the proposed optimization framework in a simulated environment.
1.Optimization on product manifolds under a preconditioned metric
Authors:Bin Gao, Renfeng Peng, Ya-xiang Yuan
Abstract: Since optimization on Riemannian manifolds relies on the chosen metric, it is appealing to know how the performance of a Riemannian optimization method varies with different metrics and how to carefully construct a metric such that a method can be accelerated. To this end, we propose a general framework for optimization problems on product manifolds where the search space is endowed with a preconditioned metric, and we develop the Riemannian gradient descent and Riemannian conjugate gradient methods under this metric. Specifically, the metric is constructed by an operator that aims to approximate the diagonal blocks of the Riemannian Hessian of the cost function, which has a preconditioning effect. We explain the relationship between the proposed methods and the variable metric methods, and show that various existing methods, e.g., the Riemannian Gauss--Newton method, can be interpreted by the proposed framework with specific metrics. In addition, we tailor new preconditioned metrics and adapt the proposed Riemannian methods to the canonical correlation analysis and the truncated singular value decomposition problems, and we propose the Gauss--Newton method to solve the tensor ring completion problem. Numerical results among these applications verify that a delicate metric does accelerate the Riemannian optimization methods.
2.Optimal control of port-Hamiltonian systems: energy, entropy, and exergy
Authors:Friedrich Philipp, Manuel Schaller, Karl Worthmann, Timm Faulwasser, Bernhard Maschke
Abstract: We consider irreversible and coupled reversible-irreversible nonlinear port-Hamiltonian systems and the respective sets of thermodynamic equilibria. In particular, we are concerned with optimal state transitions and output stabilization on finite-time horizons. We analyze a class of optimal control problems, where the performance functional can be interpreted as a linear combination of energy supply, entropy generation, or exergy supply. Our results establish the integral turnpike property towards the set of thermodynamic equilibria providing a rigorous connection of optimal system trajectories to optimal steady states. Throughout the paper, we illustrate our findings by means of two examples: a network of heat exchangers and a gas-piston system.
3.iNALM: An inexact Newton Augmented Lagrangian Method for Zero-One Composite Optimization
Authors:Penghe Zhang, Naihua Xiu, Hou-Duo Qi
Abstract: Zero-One Composite Optimization (0/1-COP) is a prototype of nonsmooth, nonconvex optimization problems and it has attracted much attention recently. The augmented Lagrangian Method (ALM) has stood out as a leading methodology for such problems. The main purpose of this paper is to extend the classical theory of ALM from smooth problems to 0/1-COP. We propose, for the first time, second-order optimality conditions for 0/1-COP. In particular, under a second-order sufficient condition (SOSC), we prove the R-linear convergence rate of the proposed ALM. In order to identify the subspace used in SOSC, we employ the proximal operator of the 0/1-loss function, leading to an active-set identification technique. Built around this identification process, we design practical stopping criteria for any algorithm to be used for the subproblem of ALM. We justify that Newton's method is an ideal candidate for the subproblem and it enjoys both global and local quadratic convergence. Those considerations result in an inexact Newton ALM (iNALM). The method of iNALM is unique in the sense that it is active-set based, it is inexact (hence more practical), and SOSC plays an important role in its R-linear convergence analysis. The numerical results on both simulated and real datasets show the fast running speed and high accuracy of iNALM when compared with several leading solvers.
4.Distributionally Robust Stratified Sampling for Stochastic Simulations with Multiple Uncertain Input Models
Authors:Seung Min Baik, Eunshin Byon, Young Myoung Ko
Abstract: This paper presents a robust version of the stratified sampling method when multiple uncertain input models are considered for stochastic simulation. Various variance reduction techniques have demonstrated their superior performance in accelerating simulation processes. Nevertheless, they often use a single input model and further assume that the input model is exactly known and fixed. We consider more general cases in which it is necessary to assess a simulation's response to a variety of input models, such as when evaluating the reliability of wind turbines under nonstationary wind conditions or the operation of a service system when the distribution of customer inter-arrival time is heterogeneous at different times. Moreover, the estimation variance may be considerably impacted by uncertainty in input models. To address such nonstationary and uncertain input models, we offer a distributionally robust (DR) stratified sampling approach with the goal of minimizing the maximum of worst-case estimator variances among plausible but uncertain input models. Specifically, we devise a bi-level optimization framework for formulating DR stochastic problems with different ambiguity set designs, based on the $L_2$-norm, 1-Wasserstein distance, parametric family of distributions, and distribution moments. In order to cope with the non-convexity of the objective function, we present a solution approach that uses Bayesian optimization. Numerical experiments and the wind turbine case study demonstrate the robustness of the proposed approach.
5.Kinetic based optimization enhanced by genetic dynamics
Authors:Giacomo Albi, Federica Ferrarese, Claudia Totzeck
Abstract: We propose and analyse a variant of the recently introduced kinetic based optimization method that incorporates ideas like survival-of-the-fittest and mutation strategies well-known from genetic algorithms. Thus, we provide a first attempt to reach out from the class of consensus/kinetic-based algorithms towards genetic metaheuristics. Different generations of genetic algorithms are represented via two species identified with different labels, binary interactions are prescribed on the particle level, and then we derive a mean-field approximation in order to analyse the method in terms of convergence. Numerical results underline the feasibility of the approach and show in particular that the genetic dynamics improve the efficiency of this class of global optimization methods in terms of computational cost.
6.Two sided ergodic singular control and mean field game for diffusions
Authors:Sören Christensen, Ernesto Mordecki, Facundo Oliú Eguren
Abstract: Consider two independent controlled linear diffusions with the same dynamics and the same ergodic controls, the first corresponding to an individual player, the second to the market. Let us also consider a cost function that depends on the first diffusion and the expectation of the second one. In this framework, we study the mean-field game consisting in finding the equilibrium points where the controls chosen by the player to minimize an ergodic integrated cost coincide with the market controls. We first show that in the control problem, without market dependence, the best policy is to reflect the process within two boundaries. We use these results to get criteria for the optimal and market controls to coincide (i.e., equilibrium existence), and give a pair of nonlinear equations to find these equilibrium points. We also get criteria for the existence and uniqueness of equilibrium points for the mean-field games under study. These results are illustrated through several examples where the existence and uniqueness of the equilibrium points depend on the values of the parameters defining the underlying diffusion.
7.A Score-based Nonlinear Filter for Data Assimilation
Authors:Feng Bao, Zezhong Zhang, Guannan Zhang
Abstract: We introduce a score-based generative sampling method for solving the nonlinear filtering problem with robust accuracy. A major drawback of existing nonlinear filtering methods, e.g., particle filters, is their low stability. To overcome this issue, we adopt the diffusion model framework to solve the nonlinear filtering problem. Instead of storing the information of the filtering density in a finite number of Monte Carlo samples, in the score-based filter we store the information of the filtering density in the score model. Then, via the reverse-time diffusion sampler, we can generate unlimited samples to characterize the filtering density. Moreover, with the powerful expressive capabilities of deep neural networks, it has been demonstrated that a well-trained score in a diffusion model can produce samples from complex target distributions in very high dimensional spaces. Extensive numerical experiments show that our score-based filter could potentially address the curse of dimensionality in very high dimensional problems.
1.Equitable Optimization of Patient Re-allocation and Temporary Facility Placement to Maximize Critical Care System Resilience in Disasters
Authors:Chia-Fu Liu, Ali Mostafavi
Abstract: End-stage renal disease patients face a complicated sociomedical situation and rely on various forms of infrastructure for life-sustaining treatment. Disruption of these infrastructures during disasters poses a major threat to their lives. To improve patient access to dialysis treatment, there is a need to assess the potential threat to critical care facilities from hazardous events. In this study, we propose optimization models to solve critical care system resilience problems including patient and medical resource allocation. We use human mobility data in the context of Harris County (Texas) to assess patient access to critical care facilities, dialysis centers in this study, under the simulated hazard impacts, and we propose models for patient re-allocation and temporary medical facility placement to improve critical care system resilience in an equitable manner. The results show (1) the capability of the optimization model in efficient patient re-allocation to alleviate disrupted access to dialysis facilities; (2) the importance of large facilities in maintaining the functioning of the system. The critical care system, particularly the network of dialysis centers, is heavily reliant on a few larger facilities, making it susceptible to targeted disruption. (3) The consideration of equity in the optimization model formulation reduces access loss for vulnerable populations in the simulated scenarios. (4) The proposed temporary facilities placement could improve access for the vulnerable population, thereby improving the equity of access to critical care facilities in disaster. The proposed patient re-allocation model and temporary facilities placement can serve as a data-driven and analytic-based decision support tool for public health and emergency management plans to reduce the loss of access and disrupted access to critical care facilities and would reduce the dire social costs.
2.Efficient Algorithm for Solving Hyperbolic Programs
Authors:Yichuan Deng, Zhao Song, Lichen Zhang, Ruizhe Zhang
Abstract: Hyperbolic polynomials are a class of real-rooted polynomials with a wide range of applications in theoretical computer science. Each hyperbolic polynomial also induces a hyperbolic cone that is of particular interest in optimization due to its generality, as by choosing the polynomial properly, one can easily recover classic optimization problems such as linear programming and semidefinite programming. In this work, we develop efficient algorithms for hyperbolic programming, the problem in which one wants to minimize a linear objective under a system of linear constraints, and the solution must be in the hyperbolic cone induced by the hyperbolic polynomial. Our algorithm is an instance of the interior point method (IPM) that, instead of following the central path, follows the central swath, which is a generalization of the central path. To implement the IPM efficiently, we utilize a relaxation of the hyperbolic program to a quadratic program, coupled with the first four moments of the hyperbolic eigenvalues that are crucial to update the optimization direction. We further show that, given an evaluation oracle of the polynomial, our algorithm only requires $O(n^2d^{2.5})$ oracle calls, where $n$ is the number of variables and $d$ is the degree of the polynomial, with extra $O((n+m)^3 d^{0.5})$ arithmetic operations, where $m$ is the number of constraints.
3.Two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems
Authors:Chenzheng Guo, Jing Zhao
Abstract: In this paper, we study an algorithm for solving a class of nonconvex and nonsmooth nonseparable optimization problems. Based on proximal alternating linearized minimization (PALM), we propose a new iterative algorithm which combines two-step inertial extrapolation and Bregman distance. By constructing an appropriate benefit function, with the help of the Kurdyka--Łojasiewicz property we establish the convergence of the whole sequence generated by the proposed algorithm. We apply the algorithm to signal recovery and a quadratic fractional programming problem, and show the effectiveness of the proposed algorithm.
4.Convergence to consensus results for Hegselmann-Krause type models with attractive-lacking interaction
Authors:Elisa Continelli, Cristina Pignotti
Abstract: In this paper, we analyze a Hegselmann-Krause opinion formation model with attractive-lacking interaction. More precisely, we investigate the situation in which the individuals involved in an opinion formation process interact among themselves but can eventually suspend the exchange of information among each other at some times. Under quite general assumptions, we prove the exponential convergence to consensus for the Hegselmann-Krause model in the presence of a possible lack of interaction. We then extend the analysis to an analogous model in the presence of time delays.
5.Adaptive Stochastic Optimization Algorithms for Problems with Biased Oracles
Authors:Yin Liu, Sam Davanloo Tajbakhsh
Abstract: Motivated by multiple emerging applications in machine learning, we consider an optimization problem in a general form where the gradient of the objective is only available through a biased stochastic oracle. We assume the bias magnitude can be controlled by a parameter; however, lower bias requires more computation/samples. For instance, for two applications on stochastic composition optimization and policy optimization for infinite-horizon Markov decision processes, we show that the bias follows a power law and exponential decay, respectively, as functions of their corresponding bias control parameters. For problems with such gradient oracles, the paper proposes two stochastic algorithms that adaptively adjust the bias control parameter throughout the iterations. We analyze the nonasymptotic performance of the proposed algorithms in the nonconvex regime and establish $\mathcal{O}(\epsilon^{-4})$ and (optimal) $\mathcal{O}(\epsilon^{-3})$ sample complexity to obtain an $\epsilon$-stationary point. Finally, we numerically evaluate the performance of the proposed algorithms over the two applications.
6.Galerkin-like method for integro-differential inclusions with application to state-dependent sweeping processes
Authors:Pedro Pérez-Aros, Manuel Torres-Valdebenito, Emilio Vilches
Abstract: In this paper, we develop the Galerkin-like method to deal with first-order integro-differential inclusions. Under compactness or monotonicity conditions, we obtain new results for the existence of solutions for this class of problems, which generalize existing results in the literature and give new insights for differential inclusions with an unbounded right-hand side. The effectiveness of the proposed approach is illustrated by providing new results for nonconvex state-dependent integro-differential sweeping processes, where the right-hand side is unbounded, and the classical theory of differential inclusions is not applicable. It is the first result of this kind. The paper ends with an application to the existence of an optimal control problem governed by an integro-differential inclusion in finite dimensions.
7.Globally convergent homotopies for discrete-time optimal control
Authors:Willem Esterhuizen, Kathrin Flaßkamp, Matthias Hoffmann, Karl Worthmann
Abstract: Homotopy methods are attractive due to their capability of solving difficult optimization and optimal control problems. The underlying idea is to construct a homotopy, which may be considered as a continuous (zero) curve between the difficult original problem and a related, comparatively-easy one. Then, the solution of the easier one is continuously perturbed along the zero curve towards the desired solution of the difficult problem. We propose a methodology for the systematic construction of such zero curves for discrete-time optimal control problems drawing upon the theory of globally-convergent homotopies for nonlinear programs. This framework ensures that for almost every easy solution there exists a suitable homotopy path that is, in addition, numerically tractable. We demonstrate the results by solving a difficult path planning problem.
8.Symmetry & Critical Points for Symmetric Tensor Decomposition Problems
Authors:Yossi Arjevani, Gal Vinograd
Abstract: We consider the non-convex optimization problem associated with the decomposition of a real symmetric tensor into a sum of rank one terms. Use is made of the rich symmetry structure to derive Puiseux series representations of families of critical points, and so obtain precise analytic estimates on the critical values and the Hessian spectrum. The sharp results make possible an analytic characterization of various geometric obstructions to local optimization methods, revealing in particular a complex array of saddles and local minima which differ by their symmetry, structure and analytic properties. A desirable phenomenon, occurring for all critical points considered, concerns the index of a point, i.e., the number of negative Hessian eigenvalues, increasing with the value of the objective function. Lastly, a Newton polytope argument is used to give a complete enumeration of all critical points of fixed symmetry, and it is shown that, contrary to the set of global minima, which remains invariant under different choices of tensor norms, certain families of non-global minima emerge while others disappear.
1.Convergence Rates of the Regularized Optimal Transport: Disentangling Suboptimality and Entropy
Authors:Hugo Malamut (CEREMADE), Maxime Sylvestre (CEREMADE)
Abstract: We study the convergence of the transport plans $\gamma_\epsilon$ towards $\gamma_0$ as well as the cost of the entropy-regularized optimal transport $\langle c, \gamma_\epsilon \rangle$ towards $\langle c, \gamma_0 \rangle$ as the regularization parameter $\epsilon$ vanishes in the setting of finite entropy marginals. We show that under the assumption of infinitesimally twisted cost and compactly supported marginals the distance $W_2(\gamma_\epsilon, \gamma_0)$ is asymptotically greater than $C\sqrt{\epsilon}$ and the suboptimality $\langle c, \gamma_\epsilon \rangle - \langle c, \gamma_0 \rangle$ is of order $\epsilon$. In the quadratic cost case the compactness assumption is relaxed into a moment of order $2 + \delta$ assumption. Moreover, in the case of a Lipschitz transport map for the non-regularized problem, the distance $W_2(\gamma_\epsilon, \gamma_0)$ converges to 0 at rate $\sqrt{\epsilon}$. Finally, if in addition the marginals have finite Fisher information, we prove $\langle c, \gamma_\epsilon \rangle - \langle c, \gamma_0 \rangle \sim d\epsilon/2$ and we provide a companion expansion of $H(\gamma_\epsilon)$. These results are achieved by disentangling the role of the cost and the entropy in the regularized problem.
2.Sensitivity Analysis in Parametric Convex Vector Optimization
Authors:Duong Thi Viet An, Le Thanh Tung
Abstract: In this paper, sensitivity analysis of the efficient sets in parametric convex vector optimization is considered. Namely, the perturbation, weak perturbation, and proper perturbation maps are defined as set-valued maps. We establish the formulas for computing the Fréchet coderivative of the profile of the above three kinds of perturbation maps. Because of the convexity assumptions, the conditions set are fairly simple if compared to those in the general case. In addition, our conditions are stated directly on the data of the problem. It is worth emphasizing that our approach is based on convex analysis tools which are different from those in the general case.
3.Towards continuous-time MPC: a novel trajectory optimization algorithm
Authors:Souvik Das, Siddhartha Ganguly, Muthyala Anjali, Debasish Chatterjee
Abstract: This article introduces a numerical algorithm that serves as a preliminary step toward solving continuous-time model predictive control (MPC) problems directly without explicit time-discretization. The chief ingredients of the underlying optimal control problem (OCP) are a linear time-invariant system, quadratic instantaneous and terminal cost functions, and convex path constraints. The thrust of the method involves finitely parameterizing the admissible space of control trajectories and solving the OCP satisfying the given constraints at every time instant in a tractable manner without explicit time-discretization. The ensuing OCP turns out to be a convex semi-infinite program (SIP), and some recently developed results are employed to obtain an optimal solution to this convex SIP. Numerical illustrations on some benchmark models are included to show the efficacy of the algorithm.
4.An agent-based decentralized threshold policy finding the constrained shortest paths
Authors:Francesca Rosset, Raffaele Pesenti, Franco Blanchini
Abstract: We consider a problem where autonomous agents enter a dynamic and unknown environment described by a network of weighted arcs. These agents move within the network from node to node according to a decentralized policy using only local information, with the goal of finding a path to an unknown sink node to leave the network. This policy makes each agent move to some adjacent node or stop at the current node. The transition along an arc is allowed or denied based on a threshold mechanism that takes into account the number of agents already accumulated in the arc's end nodes and the arc's weight. We show that this policy ensures path-length optimality in the sense that, in a finite time, all new agents entering the network reach the closer sinks by the shortest paths. Our approach is later extended to support constraints on the paths that agents can follow.
5.On the Computation-Communication Trade-Off with A Flexible Gradient Tracking Approach
Authors:Yan Huang, Jinming Xu
Abstract: We propose a flexible gradient tracking approach with adjustable computation and communication steps for solving distributed stochastic optimization problems over networks. The proposed method allows each node to perform multiple local gradient updates and multiple inter-node communications in each round, aiming to strike a balance between computation and communication costs according to the properties of objective functions and network topology in non-i.i.d. settings. Leveraging a properly designed Lyapunov function, we derive both the computation and communication complexities for achieving arbitrary accuracy on smooth and strongly convex objective functions. Our analysis demonstrates sharp dependence of the convergence performance on graph topology and properties of objective functions, highlighting the trade-off between computation and communication. Numerical experiments are conducted to validate our theoretical findings.
6.Analysis of the vanishing discount limit for optimal control problems in continuous and discrete time
Authors:Piermarco Cannarsa, Stephane Gaubert, Cristian Mendico, Marc Quincampoix
Abstract: A classical problem in ergodic continuous time control consists of studying the limit behavior of the optimal value of a discounted cost functional with infinite horizon as the discount factor $\lambda$ tends to zero. In the literature, this problem has been addressed under various controllability or ergodicity conditions ensuring that the rescaled value function converges uniformly to a constant limit. In this case the limit can be characterized as the unique constant such that a suitable Hamilton-Jacobi equation has at least one continuous viscosity solution. In this paper, we study this problem without such conditions, so that the aforementioned limit need not be constant. Our main result characterizes the uniform limit (when it exists) as the maximal subsolution of a system of Hamilton-Jacobi equations. Moreover, when such a subsolution is a viscosity solution, we obtain the convergence of optimal values as well as a rate of convergence. This mirrors the analysis of the discrete time case, where we characterize the uniform limit as the supremum over a set of sub-invariant half-lines of the dynamic programming operator. The emerging structure in both discrete and continuous time models shows that the supremum over sub-invariant half-lines with respect to the Lax-Oleinik semigroup/dynamic programming operator captures the behavior of the limit cost as the discount vanishes.
1.An Accelerated Stochastic ADMM for Nonconvex and Nonsmooth Finite-Sum Optimization
Authors:Yuxuan Zeng, Zhiguo Wang, Jianchao Bai, Xiaojing Shen
Abstract: The nonconvex and nonsmooth finite-sum optimization problem with linear constraint has attracted much attention in the fields of artificial intelligence, computer science, and mathematics, due to its wide applications in machine learning and the lack of efficient algorithms with convincing convergence theories. A popular approach to solve it is the stochastic Alternating Direction Method of Multipliers (ADMM), but most stochastic ADMM-type methods focus on convex models. In addition, the variance reduction (VR) and acceleration techniques are useful tools in the development of stochastic methods due to their simplicity and practicability in providing acceleration characteristics of various machine learning models. However, it remains unclear whether the accelerated SVRG-ADMM algorithm (ASVRG-ADMM), which extends SVRG-ADMM by incorporating momentum techniques, exhibits a comparable acceleration characteristic or convergence rate in the nonconvex setting. To fill this gap, we consider a general nonconvex nonsmooth optimization problem and study the convergence of ASVRG-ADMM. By utilizing a well-defined potential energy function, we establish its sublinear convergence rate $O(1/T)$, where $T$ denotes the iteration number. Furthermore, under the additional Kurdyka-Lojasiewicz (KL) property, which is less stringent than the frequently used conditions for showcasing linear convergence rates, such as strong convexity, we show that the ASVRG-ADMM sequence has a finite length and converges to a stationary solution with a linear convergence rate. Several experiments on solving the graph-guided fused lasso problem and regularized logistic regression problem validate that the proposed ASVRG-ADMM performs better than the state-of-the-art methods.
2.Robust Data-driven Prescriptiveness Optimization
Authors:Mehran Poursoltani, Erick Delage, Angelos Georghiou
Abstract: The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information to provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.
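The bisection scheme mentioned in the abstract above can be sketched generically as follows. This is only an illustration of bisection over a scalar level with a monotone feasibility oracle; in the paper each check is described as a linear program, whereas here the oracle `feasible` is a hypothetical stand-in and the names `bisect_max`, `lo`, `hi`, and `tol` are assumptions.

```python
def bisect_max(feasible, lo, hi, tol=1e-6):
    """Return (approximately) the largest t in [lo, hi] with feasible(t) True.

    Assumes feasible is monotone: True up to some threshold, False afterwards.
    In a real model each call would solve one optimization problem.
    """
    if not feasible(lo):
        raise ValueError("lower end of the interval must be feasible")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid   # threshold lies at or above mid
        else:
            hi = mid   # threshold lies below mid
    return lo


# Toy oracle standing in for an LP feasibility check: t is feasible iff t^2 <= 2.
t_star = bisect_max(lambda t: t * t <= 2.0, 0.0, 2.0)
```

The number of oracle calls grows only logarithmically in `(hi - lo) / tol`, which is why bisection pairs well with an expensive feasibility subproblem.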
3.Lifting partial smoothing to solve HJB equations and stochastic control problems
Authors:Fausto Gozzi, Federica Masiero
Abstract: We study a family of stochastic control problems arising in typical applications (such as boundary control and control of delay equations with delay in the control) with the ultimate aim of finding solutions of the associated HJB equations, regular enough to find optimal feedback controls. These problems are difficult to treat since the underlying transition semigroups possess neither good smoothing properties nor the so-called "structure condition" which typically allows one to apply the backward equations approach. In the papers [14], [15], and, more recently, [16] we studied such problems developing new partial smoothing techniques which allowed us to obtain the required regularity in the case when the cost functional is independent of the state variable. This is a somewhat strong restriction which is not verified in most applications. In this paper (which can be considered a continuation of the research of the above papers) we develop a new approach to overcome this restriction. We extend the partial smoothing result to a wider class of functions which depend on the whole trajectory of the underlying semigroup and we use this as a key tool to improve our regularity result for the HJB equation. The fact that such a class depends on trajectories requires nontrivial technical work, as we have to lift the original transition semigroup to a space of trajectories, defining a new "high-level" environment where our problems can be solved.
4.Branching via Cutting Plane Selection: Improving Hybrid Branching
Authors:Mark Turner, Timo Berthold, Mathieu Besançon, Thorsten Koch
Abstract: Cutting planes and branching are two of the most important algorithms for solving mixed-integer linear programs. For both algorithms, disjunctions play an important role, being used both as branching candidates and as the foundation for some cutting planes. We relate branching decisions and cutting planes to each other through the underlying disjunctions that they are based on, with a focus on Gomory mixed-integer cuts and their corresponding split disjunctions. We show that selecting branching decisions based on quality measures of Gomory mixed-integer cuts leads to relatively small branch-and-bound trees, and that the result improves when using cuts that more accurately represent the branching decisions. Finally, we show how the history of previously computed Gomory mixed-integer cuts can be used to improve the performance of the state-of-the-art hybrid branching rule of SCIP. Our results show a 4% decrease in solve time, and an 8% decrease in number of nodes over affected instances of MIPLIB 2017.
1.Communication-Efficient Gradient Descent-Accent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates
Authors:Siqi Zhang, Sayantan Choudhury, Sebastian U Stich, Nicolas Loizou
Abstract: Distributed and federated learning algorithms and techniques are associated primarily with minimization problems. However, with the increase of minimax optimization and variational inequality problems in machine learning, the necessity of designing efficient distributed/federated learning approaches for these problems is becoming more apparent. In this paper, we provide a unified convergence analysis of communication-efficient local training methods for distributed variational inequality problems (VIPs). Our approach is based on a general key assumption on the stochastic estimates that allows us to propose and analyze several novel local training algorithms under a single framework for solving a class of structured non-monotone VIPs. We present the first local gradient descent-ascent algorithms with provably improved communication complexity for solving distributed variational inequalities on heterogeneous data. The general algorithmic framework recovers state-of-the-art algorithms and their sharp convergence guarantees when the setting is specialized to minimization or minimax optimization problems. Finally, we demonstrate the strong performance of the proposed algorithms compared to state-of-the-art methods when solving federated minimax optimization problems.
2.Zero-sum stopper vs. singular-controller games with constrained control directions
Authors:Andrea Bovo, Tiziano De Angelis, Jan Palczewski
Abstract: We consider a class of zero-sum stopper vs. singular-controller games in which the controller can only act on a subset $d_0<d$ of the $d$ coordinates of a controlled diffusion. Due to the constraint on the control directions these games fall outside the framework of recently studied variational methods. In this paper we develop an approximation procedure, based on $L^1$-stability estimates for the controlled diffusion process and almost sure convergence of suitable stopping times. That allows us to prove existence of the game's value and to obtain an optimal strategy for the stopper, under continuity and growth conditions on the payoff functions. This class of games is a natural extension of (single-agent) singular control problems, studied in the literature, with similar constraints on the admissible controls.
3.On the Identification and Optimization of Nonsmooth Superposition Operators in Semilinear Elliptic PDEs
Authors:Constantin Christof, Julia Kowalczyk
Abstract: We study an infinite-dimensional optimization problem that aims to identify the Nemytskii operator in the nonlinear part of a prototypical semilinear elliptic partial differential equation (PDE) which minimizes the distance between the PDE-solution and a given desired state. In contrast to previous works, we consider this identification problem in a low-regularity regime in which the function inducing the Nemytskii operator is a-priori only known to be an element of $H^1_{loc}(\mathbb{R})$. This makes the studied problem class a suitable point of departure for the rigorous analysis of training problems for learning-informed PDEs in which an unknown superposition operator is approximated by means of a neural network with nonsmooth activation functions (ReLU, leaky-ReLU, etc.). We establish that, despite the low regularity of the controls, it is possible to derive a classical stationarity system for local minimizers and to solve the considered problem by means of a gradient projection method. The convergence of the resulting algorithm is proven in the function space setting. It is also shown that the established first-order necessary optimality conditions imply that locally optimal superposition operators share various characteristic properties with commonly used activation functions: They are always sigmoidal, continuously differentiable away from the origin, and typically possess a distinct kink at zero. The paper concludes with numerical experiments which confirm the theoretical findings.
4.Safe Adaptive Multi-Agent Coverage Control
Authors:Yang Bai, Yujie Wang, Xiaogang Xiong, Mikhail Svinin
Abstract: This paper presents a safe adaptive coverage controller for multi-agent systems with actuator faults and time-varying uncertainties. The centroidal Voronoi tessellation (CVT) is applied to generate an optimal configuration of multi-agent systems for covering an area of interest. As a conventional CVT-based controller cannot prevent collisions between agents with non-zero size, a control barrier function (CBF) based controller is developed to ensure collision avoidance with a function approximation technique (FAT) based design to deal with system uncertainties. The proposed controller is verified under simulations.
1.End-to-End Learning for Stochastic Optimization: A Bayesian Perspective
Authors:Yves Rychener, Daniel Kuhn, Tobias Sutter
Abstract: We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this analysis, we then propose new end-to-end learning algorithms for training decision maps that output solutions of empirical risk minimization and distributionally robust optimization problems, two dominant modeling paradigms in optimization under uncertainty. Numerical results for a synthetic newsvendor problem illustrate the key differences between alternative training schemes. We also investigate an economic dispatch problem based on real data to showcase the impact of the neural network architecture of the decision maps on their test performance.
2.Two-step inertial Bregman alternating structure-adapted proximal gradient descent algorithm for nonconvex and nonsmooth problems
Authors:Chenzheng Guo, Jing Zhao
Abstract: In the paper, we introduce several accelerated iterative algorithms for solving the multiple-set split common fixed-point problem of quasi-nonexpansive operators in real Hilbert space. Based on the primal-dual method, we construct several iterative algorithms in a way that combines inertial technology and the self-adaptive stepsize such that the implementation of the algorithms does not need any prior information about the norm of the bounded linear operator. Under suitable assumptions, weak convergence of the proposed algorithms is established. As applications, we obtain relative iterative algorithms to solve the multiple-set split feasibility problem. Finally, the performance of the proposed algorithms is illustrated by numerical experiments.
3.Input Rate Control in Stochastic Road Traffic Networks: Effective Bandwidths
Authors:Nikki Levering, Rudesindo Núñez-Queija
Abstract: In road traffic networks, large traffic volumes may lead to extreme delays. These severe delays are caused by the fact that, whenever the maximum capacity of a road is approached, speeds drop rapidly. Therefore, the focus in this paper is on real-time control of traffic input rates, thereby aiming to prevent such detrimental capacity drops. To account for the fact that, by the heterogeneity within and between traffic streams, the available capacity of a road suffers from randomness, we introduce a stochastic flow model that describes the impact of traffic input streams on the available road capacities. Then, exploiting similarities with traffic control of telecommunication networks, in which the available bandwidth is a stochastic function of the input rate, and in which the use of effective bandwidths has proven an effective input rate control framework, we propose a similar traffic rate control policy based on the concept of effective bandwidths. This policy allows for increased waiting times at the access boundaries of the network, so as to limit the probability of large delays within the network. Numerical examples show that, by applying such a control policy, capacity violations are indeed rare, and that the increased waiting at the boundaries of the network is of much smaller scale, compared to uncontrolled delays in the network.
4.A Decomposition Approach to Last Mile Delivery Using Public Transportation Systems
Authors:Minakshi Punam Mandal, Claudia Archetti
Abstract: This study explores the potential of using public transportation systems for freight delivery, where we intend to utilize the spare capacities of public vehicles like buses, trams, metros, and trains, particularly during off-peak hours, to transport packages within the city instead of using dedicated delivery vehicles. The study contributes to the growing literature on innovative strategies for performing sustainable last mile deliveries. We study an operational level problem called the Three-Tier Delivery Problem on Public Transportation, where packages are first transported from the Consolidation and Distribution Center (CDC) to nearby public vehicle stations by delivery trucks. From there, public vehicles transport them into the city area. The last leg of the delivery is performed to deliver the packages to their respective customers using green vehicles or eco-friendly systems. We propose mixed-integer linear programming formulations to study the transport of packages from the CDC to the customers, use decomposition approaches to solve them, and provide numerical experiments to demonstrate the efficiency and effectiveness of the system. Our results show that this system has the potential to drastically reduce the length of trips performed by dedicated delivery vehicles, thereby reducing the negative social and environmental impacts of existing last mile delivery systems.
5.Distributed accelerated proximal conjugate gradient methods for multi-agent constrained optimization problems
Authors:Anteneh Getachew Gebrie
Abstract: The purpose of this paper is to introduce two new classes of accelerated distributed proximal conjugate gradient algorithms for multi-agent constrained optimization problems, given as the minimization of a function decomposed as a sum of $M$ smooth and $M$ nonsmooth functions over the common fixed points of $M$ nonlinear mappings. Exploiting the special properties of the cost component function of the objective function and the nonlinear mapping of the constraint problem of each agent, new inertial accelerated incremental and parallel computing distributed algorithms are presented, based on combinations of proximal, conjugate gradient and Halpern methods. Some numerical experiments and comparisons are given to illustrate our results.
6.A Hierarchical OPF Algorithm with Improved Gradient Evaluation in Three-Phase Networks
Authors:Heng Liang, Xinyang Zhou, Changhong Zhao
Abstract: Linear approximation commonly used in solving alternating-current optimal power flow (AC-OPF) simplifies the system models but incurs accumulated voltage errors in large power networks. Such errors will make the primal-dual type gradient algorithms converge to the solutions at which the power networks may be exposed to the risk of voltage violation. In this paper, we improve a recent hierarchical OPF algorithm that rested on primal-dual gradients evaluated with a linearized distribution power flow model. Specifically, we propose a more accurate gradient evaluation method based on a three-phase unbalanced nonlinear distribution power flow model to mitigate the errors arising from model linearization. The resultant gradients feature a blocked structure that enables us to further develop an improved hierarchical primal-dual algorithm to solve the OPF problem. Numerical results on the IEEE $123$-bus test feeder and a $4,518$-node test feeder show that the proposed method can enhance the overall voltage safety while achieving comparable computational efficiency with the linearized algorithm.
7.Comparison of SeDuMi and SDPT3 Solvers for Stability of Continuous-time Linear System
Authors:Guangda Xu
Abstract: SeDuMi and SDPT3 are two solvers for Semi-definite Programming (SDP) or Linear Matrix Inequality (LMI) problems. A computational performance comparison of the two is undertaken in this paper regarding the Stability of Continuous-time Linear Systems. The comparison mainly focuses on computational times and memory requirements for different scales of problems. To implement and compare the two solvers on a set of well-posed problems, we employ YALMIP, a widely used toolbox for modeling and optimization in MATLAB. The primary goal of this study is to provide an empirical assessment of the relative computational efficiency of SeDuMi and SDPT3 under varying problem conditions. Our evaluation indicates that SDPT3 performs much better in large-scale, high-precision calculations.
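For context, the stability question referred to in the abstract above is commonly posed as the following LMI feasibility problem (the classical Lyapunov characterization; the symbols $A$ and $P$ are notation assumed here, not taken from the paper): the system $\dot{x} = Ax$ is asymptotically stable if and only if a symmetric matrix $P$ exists with

```latex
P \succ 0, \qquad A^{\top} P + P A \prec 0 .
```

An SDP solver handed this problem searches for such a $P$; the problem size grows with the dimension of $A$, which is the kind of scaling a solver comparison would probe.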
8.The lifted functional approach to mean field games with common noise
Authors:Mark Cerenzia, Aaron Palmer
Abstract: We introduce a new path-by-path approach to mean field games with common noise that recovers duality at the pathwise level. We verify this perspective by explicitly solving some difficult examples with linear-quadratic data, including control in the volatility coefficient of the common noise as well as the constraint of partial information. As an application, we establish the celebrated separation principle in the latter context. In pursuing this program, we believe we have made a crucial contribution to clarifying the notion of regular solution in the path dependent PDE literature.
1.New Relaxation Modulus Based Iterative Method for Large and Sparse Implicit Complementarity Problem
Authors:Bharat Kumar, Deepmala, A. K. Das
Abstract: This article presents a class of new relaxation modulus-based iterative methods to process the large and sparse implicit complementarity problem (ICP). Using two positive diagonal matrices, we formulate a fixed-point equation and prove that it is equivalent to ICP. Also, we provide sufficient convergence conditions for the proposed methods when the system matrix is a $P$-matrix or an $H_+$-matrix. Keyword: Implicit complementarity problem, $H_{+}$-matrix, $P$-matrix, matrix splitting, convergence
2.Weak KAM Theory and Aubry-Mather Theory for sub-Riemannian control systems
Authors:Piermarco Cannarsa, Cristian Mendico
Abstract: The aim of this work is to provide a systematic study and generalization of the celebrated weak KAM theory and Aubry-Mather theory in the sub-Riemannian setting, or equivalently, on a Carnot-Carathéodory metric space. In this framework we consider an optimal control problem with state equation of sub-Riemannian type, namely, admissible trajectories are solutions of an ODE that is linear in the control and nonlinear in space. Such a nonlinearity is given by a family of smooth vector fields satisfying the Hörmander condition, which implies the controllability of the system. In this case, the Hamiltonian function associated with the above control problem fails to be coercive and thus the results in the Tonelli setting cannot be applied. In order to overcome this issue, our approach is based on metric properties of the geometry induced on the state space by the sub-Riemannian structure.
3.Characterization of transport optimizers via graphs and applications to Stackelberg-Cournot-Nash equilibria
Authors:Beatrice Acciaio, Berenice Anne Neumann
Abstract: We introduce graphs associated with transport problems between discrete marginals, which allow us to characterize the set of all optimizers given one primal optimizer. In particular, we establish that connectivity of those graphs is a necessary and sufficient condition for uniqueness of the dual optimizers. Moreover, we provide an algorithm that can efficiently compute the dual optimizer that is the limit, as the regularization parameter goes to zero, of the dual entropic optimizers. Our results find an application in a Stackelberg-Cournot-Nash game, for which we obtain existence and characterization of the equilibria.
1.On the convergence of the $k$-point bound for topological packing graphs
Authors:Bram Bekker, Fernando Mário de Oliveira Filho
Abstract: We show that the $k$-point bound of de Laat, Machado, Oliveira, and Vallentin, a hierarchy of upper bounds for the independence number of a topological packing graph derived from the Lasserre hierarchy, converges to the independence number.
2.On the Split Closure of the Periodic Timetabling Polytope
Authors:Niels Lindner, Berenike Masing
Abstract: The Periodic Event Scheduling Problem (PESP) is the central mathematical tool for periodic timetable optimization in public transport. PESP can be formulated in several ways as a mixed-integer linear program with typically general integer variables. We investigate the split closure of these formulations and show that split inequalities are identical with the recently introduced flip inequalities. While split inequalities are a general mixed-integer programming technique, flip inequalities are defined in purely combinatorial terms, namely cycles and arc sets of the digraph underlying the PESP instance. It is known that flip inequalities can be separated in pseudo-polynomial time. We prove that this is best possible unless P $=$ NP, but also observe that the complexity becomes linear-time if the cycle defining the flip inequality is fixed. Moreover, introducing mixed-integer-compatible maps, we compare the split closures of different formulations, and show that reformulation or binarization by subdivision do not lead to stronger split closures. Finally, we estimate computationally how much of the optimality gap of the instances of the benchmark library PESPlib can be closed exclusively by split cuts, and provide better dual bounds for five instances.
3.Tight Big-Ms for Optimal Transmission Switching
Authors:Salvador Pineda, Juan Miguel Morales, Álvaro Porras, Concepción Domínguez
Abstract: This paper addresses the Optimal Transmission Switching (OTS) problem in electricity networks, which aims to find an optimal power grid topology that minimizes system operation costs while satisfying physical and operational constraints. Existing methods typically convert the OTS problem into a Mixed-Integer Linear Program (MILP) using big-M constants. However, the computational performance of these approaches relies significantly on the tightness of these big-Ms. In this paper, we propose an iterative tightening strategy to strengthen the big-Ms by efficiently solving a series of bounding problems that account for the economics of the OTS objective function through an upper-bound on the generating cost. We also discuss how the performance of the proposed tightening strategy is enhanced if reduced line capacities are considered. Using the 118-bus test system we demonstrate that the proposed methodology outperforms existing approaches, offering tighter bounds and significantly reducing the computational burden of the OTS problem.
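To make the role of the big-M constants in the abstract above concrete, one common way to write the switching constraints of a DC-based OTS model for a line $\ell$ between buses $i$ and $j$ is shown below. The notation is assumed for illustration and is not taken from the paper: $x_\ell$ is the binary line status, $f_\ell$ the flow, $b_\ell$ the susceptance, $\theta_i, \theta_j$ the voltage angles, $\bar{f}_\ell$ the line capacity, and $M_\ell$ the big-M constant.

```latex
\begin{aligned}
& -\bar{f}_\ell \, x_\ell \;\le\; f_\ell \;\le\; \bar{f}_\ell \, x_\ell, \\
& -M_\ell \,(1 - x_\ell) \;\le\; f_\ell - b_\ell \,(\theta_i - \theta_j) \;\le\; M_\ell \,(1 - x_\ell), \\
& x_\ell \in \{0, 1\}.
\end{aligned}
```

When $x_\ell = 1$ the second constraint enforces the flow equation $f_\ell = b_\ell(\theta_i - \theta_j)$; when $x_\ell = 0$ the flow is forced to zero and the angle difference is only bounded through $M_\ell$, so a smaller valid $M_\ell$ gives a tighter linear relaxation.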
4.Integer Programming Games: A Gentle Computational Overview
Authors:Margarida Carvalho, Gabriele Dragotto, Andrea Lodi, Sriram Sankaranarayan
Abstract: In this tutorial, we present a computational overview on computing Nash equilibria in Integer Programming Games (IPGs), i.e., how to compute solutions for a class of non-cooperative and nonconvex games where each player solves a mixed-integer optimization problem. IPGs are a broad class of games extending the modeling power of mixed-integer optimization to multi-agent settings. This class of games includes, for instance, any finite game and any multi-agent extension of traditional combinatorial optimization problems. After providing some background motivation and context of applications, we systematically review and classify the state-of-the-art algorithms to compute Nash equilibria. We propose an essential taxonomy of the algorithmic ingredients needed to compute equilibria, and we describe the theoretical and practical challenges associated with equilibria computation. Finally, we quantitatively and qualitatively compare a sequential Stackelberg game with a simultaneous IPG to highlight the different properties of their solutions.
5.Probabilistic Region-of-Attraction Estimation with Scenario Optimization and Converse Theorems
Authors:Torbjørn Cunis
Abstract: The region of attraction characterizes well-behaved and safe operation of a nonlinear system and is hence sought after for verification. In this paper, a framework for probabilistic region of attraction estimation is developed that combines scenario optimization and converse theorems. With this approach, the probability of an unstable condition being included in the estimate is independent of the system's complexity, while convergence in probability to the true region of attraction is proven. Numerical examples demonstrate the effectiveness for optimization-based control applications. Combining systems theory and sampling, the complexity of Monte Carlo-based verification techniques can be reduced. The results can be extended to arbitrary level sets of which the defining function can be sampled, such as finite-horizon viability. Thus, the proposed approach is applicable and/or adaptable to verification of a wide range of safety-related properties for nonlinear systems including feedback laws based on optimization or learning.
6.Exact Two-Step Benders Decomposition for Two-Stage Stochastic Mixed-Integer Programs
Authors:Sifa Celik, Layla Martin, Albert H. Schrotenboer, Tom Van Woensel
Abstract: Many real-life optimization problems belong to the class of two-stage stochastic mixed-integer programming problems with continuous recourse. This paper introduces Two-Step Benders Decomposition with Scenario Clustering (TBDS) as a general exact solution methodology for solving such stochastic programs to optimality. The method combines and generalizes Benders dual decomposition, partial Benders decomposition, and Scenario Clustering techniques and does so within a novel two-step decomposition along the binary and continuous first-stage decisions. We use TBDS to provide the first exact solutions for the so-called Time Window Assignment Traveling Salesperson problem. This is a canonical optimization problem for service-oriented vehicle routing; it considers jointly assigning time windows to customers and routing a vehicle among them while travel times are stochastic. Extensive experiments show that TBDS is superior to state-of-the-art approaches in the literature. It solves instances with up to 25 customers to optimality. It provides better lower and upper bounds that lead to faster convergence than related methods. For example, Benders dual decomposition cannot solve instances of 10 customers to optimality. We use TBDS to analyze the structure of the optimal solutions. By increasing routing costs only slightly, customer service can be improved tremendously, driven by smartly alternating between high- and low-variance travel arcs to reduce the impact of delay propagation throughout the executed vehicle route.
7.Curvature and complexity: Better lower bounds for geodesically convex optimization
Authors:Christopher Criscitiello, Nicolas Boumal
Abstract: We study the query complexity of geodesically convex (g-convex) optimization on a manifold. To isolate the effect of that manifold's curvature, we primarily focus on hyperbolic spaces. In a variety of settings (smooth or not; strongly g-convex or not; high- or low-dimensional), known upper bounds worsen with curvature. It is natural to ask whether this is warranted, or an artifact. For many such settings, we propose a first set of lower bounds which indeed confirm that (negative) curvature is detrimental to complexity. To do so, we build on recent lower bounds (Hamilton and Moitra, 2021; Criscitiello and Boumal, 2022) for the particular case of smooth, strongly g-convex optimization. Using a number of techniques, we also secure lower bounds which capture dependence on condition number and optimality gap, which was not previously the case. We suspect these bounds are not optimal. We conjecture optimal ones, and support them with a matching lower bound for a class of algorithms which includes subgradient descent, and a lower bound for a related game. Lastly, to pinpoint the difficulty of proving lower bounds, we study how negative curvature influences (and sometimes obstructs) interpolation with g-convex functions.
8.Frequency Regulation with Storage: On Losses and Profits
Authors:Dirk Lauinger, François Vuille, Daniel Kuhn
Abstract: Low-carbon societies will need to store vast amounts of electricity to balance intermittent generation from wind and solar energy, for example, through frequency regulation. Here, we derive an analytical solution to the decision-making problem of storage operators who sell frequency regulation power to grid operators and trade electricity on day-ahead markets. Mathematically, we treat future frequency deviation trajectories as functional uncertainties in a receding horizon robust optimization problem. We constrain the expected terminal state-of-charge to be equal to some target to allow storage operators to make good decisions not only for the present but also the future. Thanks to this constraint, the amount of electricity traded on day-ahead markets is an implicit function of the regulation power sold to grid operators. The implicit function quantifies the amount of power that needs to be purchased to cover the expected energy loss that results from providing frequency regulation. We show how the marginal cost associated with the expected energy loss decreases with roundtrip efficiency and increases with frequency deviation dispersion. We find that the profits from frequency regulation over the lifetime of energy-constrained storage devices are roughly inversely proportional to the length of time for which regulation power must be committed.
9.Explicit feedback synthesis driven by quasi-interpolation for nonlinear robust model predictive control
Authors:Siddhartha Ganguly, Debasish Chatterjee
Abstract: We present QuIFS (Quasi-Interpolation driven Feedback Synthesis) -- an offline feedback synthesis algorithm for explicit nonlinear robust minmax model predictive control (MPC) problems with guaranteed quality of approximation. The underlying technique is driven by a particular type of grid-based quasi-interpolation scheme. The QuIFS algorithm departs drastically from conventional approximation algorithms that are employed in the MPC industry (in particular, it is neither based on multi-parametric programming tools nor does it involve kernel methods), and the essence of its point of departure is encoded in the following challenge-answer approach: Given an error margin $\varepsilon>0$, compute a feasible feedback policy that is uniformly $\varepsilon$-close to the optimal MPC feedback policy for a given nonlinear system subjected to hard constraints and bounded uncertainties. Conditions for closed-loop stability and recursive feasibility under the approximate feedback policy are also established. We provide a library of numerical examples to illustrate our results.
10.Entropic mean-field min-max problems via Best Response and Fisher-Rao flows
Authors:Razvan-Andrei Lascu, Mateusz B. Majka, Łukasz Szpruch
Abstract: We investigate convergence properties of two continuous-time optimization methods, the Mean-Field Best Response and the Fisher-Rao (Mean-Field Birth-Death) flows, for solving convex-concave min-max games with entropy regularization. We introduce suitable Lyapunov functions to establish exponential convergence to the unique mixed Nash equilibrium for both methods, albeit under slightly different conditions. Additionally, we demonstrate the convergence of the fictitious play flow as a by-product of our analysis.
1.Optimal Control and Approximate controllability of fractional semilinear differential inclusion involving $\psi$-Hilfer fractional derivatives
Authors:Bholanath Kumbhakar, Dwijendra Narain Pandey
Abstract: The current paper first studies the optimal control of linear $\psi$-Hilfer fractional derivatives with state-dependent control constraints and optimal control for a particular type of cost functional. Then, we investigate the approximate controllability of the abstract fractional semilinear differential inclusion involving the $\psi$-Hilfer fractional derivative in reflexive Banach spaces. It is known that the existence, uniqueness, optimal control, and approximate controllability of fractional differential equations or inclusions have been demonstrated for a similar type of fractional differential equations or inclusions with different fractional order derivative operators. Hence it is necessary to study fractional differential equations with more general fractional operators which incorporate all the specific fractional derivative operators. This motivates us to consider the $\psi$-Hilfer fractional differential inclusion. We assume the compactness of the corresponding semigroup and the approximate controllability of the associated linear control system and define the control with the help of duality mapping. We observe that convexity is essential in determining the controllability property of semilinear differential inclusion. In the case of Hilbert spaces, there is no issue of convexity as the duality map becomes simply the identity map. In contrast to Hilbert spaces, if we consider reflexive Banach spaces, there is an issue of convexity due to the nonlinear nature of duality mapping. The novelty of this paper is that we overcome this convexity issue and establish our main result. Finally, we test our outcomes through an example.
2.A Study of Qualitative Correlations Between Crucial Bio-markers and the Optimal Drug Regimen of Type-I Lepra Reaction: A Deterministic Approach
Authors:Dinesh Nayak, A. V. Sangeetha, D. K. K. Vamsi
Abstract: Mycobacterium leprae is a bacterium that causes leprosy (Hansen's disease), a neglected tropical disease. More than 200,000 cases are reported per year worldwide. The disease leads to a chronic stage known as Lepra reaction, which mainly causes nerve damage in the peripheral nervous system, leading to loss of organs. Early detection of Lepra reaction through the levels of bio-markers can prevent the reaction from occurring and the further disabilities. Motivated by this, we frame a mathematical model considering the pathogenesis of leprosy and the chemical pathways involved in Lepra reactions. The model incorporates the dynamics of the susceptible Schwann cells, infected Schwann cells and the bacterial load, and the concentration levels of the bio-markers $IFN-\gamma$, $TNF-\alpha$, $IL-10$, $IL-12$, $IL-15$ and $IL-17$. We consider a nine-compartment optimal control problem with the drugs used in Multi Drug Therapy (MDT) as controls. We validate the model using 2D heat plots. We study the correlation between the bio-marker levels and the drugs in MDT and propose an optimal drug regimen through these optimal control studies. We use Newton's Gradient Method for the optimal control studies.
3.The uniform diversification strategy is optimal for expected utility maximization under high model ambiguity
Authors:Laurence Carassus, Johannes Wiesel
Abstract: We investigate an expected utility maximization problem under model uncertainty in a one-period financial market. We capture model uncertainty by replacing the baseline model $\mathbb{P}$ with an adverse choice from a Wasserstein ball of radius $k$ around $\mathbb{P}$ in the space of probability measures and consider the corresponding Wasserstein distributionally robust optimization problem. We show that optimal solutions converge to the uniform diversification strategy when uncertainty is increasingly large, i.e. when the radius $k$ tends to infinity.
4.Load Asymptotics and Dynamic Speed Optimization for the Greenest Path Problem: A Comprehensive Analysis
Authors:Poulad Moradi, Joachim Arts, Josué Velázquez-Martínez
Abstract: We study the effect of using high-resolution elevation data on the selection of the most fuel-efficient (greenest) path for different trucks in various urban environments. We adapt a variant of the Comprehensive Modal Emission Model (CMEM) to show that the optimal speed and the greenest path are slope dependent (dynamic). When there are no elevation changes in a road network, the most fuel-efficient path is the shortest path with a constant (static) optimal speed throughout. However, if the network is not flat, then the shortest path is not necessarily the greenest path, and the optimal driving speed is dynamic. We prove that the greenest path converges to an asymptotic greenest path as the payload approaches infinity and that this limiting path is attained for a finite load. In a set of extensive numerical experiments, we benchmark the CO2 emissions reduction of our dynamic speed and the greenest path policies against policies that ignore elevation data. We use the geo-spatial data of 25 major cities across 6 continents, such as Los Angeles, Mexico City, Johannesburg, Athens, Ankara, and Canberra. Our results show that, on average, traversing the greenest path with a dynamic optimal speed policy can reduce the CO2 emissions by 1.19% to 10.15% depending on the city and truck type for a moderate payload. They also demonstrate that the average CO2 reduction of the optimal dynamic speed policy is between 2% and 4% for most of the cities, regardless of the truck type. We confirm that disregarding elevation data yields sub-optimal paths that are significantly less CO2 efficient than the greenest paths.
1.The Mini-batch Stochastic Conjugate Algorithms with the unbiasedness and Minimized Variance Reduction
Authors:Feifei Gao, Caixia Kou
Abstract: We first propose a new stochastic gradient estimate that is unbiased and has minimized variance. Second, we propose two algorithms, Algorithm 1 and Algorithm 2, which apply the new stochastic gradient estimate to the modern stochastic conjugate gradient algorithms SCGA [7] and CGVR [8]. We then prove that the proposed algorithms attain a linear convergence rate under assumptions of strong convexity and smoothness. Finally, numerical experiments show that the new stochastic gradient estimate effectively reduces the variance of the stochastic gradient, and that our algorithms converge faster than SCGA and CGVR on a ridge regression model.
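The abstract does not spell out the new estimate, so the following is only a minimal sketch of a generic SVRG-style control-variate estimator on ridge regression, used here as a stand-in to illustrate what "unbiased with reduced variance" means. The function names, the problem sizes and the snapshot point are all assumptions made for illustration, not the authors' construction.

```python
import numpy as np

def ridge_grad(w, X, y, lam):
    """Gradient of (1/2n)||Xw - y||^2 + (lam/2)||w||^2 over the rows of X."""
    n = X.shape[0]
    return X.T @ (X @ w - y) / n + lam * w

def vr_minibatch_grad(w, w_snap, full_grad_snap, X, y, lam, idx):
    """SVRG-style estimate: batch gradient at w, minus batch gradient at the
    snapshot, plus the full gradient at the snapshot (the control variate)."""
    Xb, yb = X[idx], y[idx]
    return ridge_grad(w, Xb, yb, lam) - ridge_grad(w_snap, Xb, yb, lam) + full_grad_snap

# Hypothetical synthetic ridge regression problem.
rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

w_snap = np.zeros(d)                      # snapshot point
w = 0.1 * rng.standard_normal(d)          # current iterate, close to the snapshot
full_grad_snap = ridge_grad(w_snap, X, y, lam)
true_grad = ridge_grad(w, X, y, lam)

# Enumerate all batches of size 1 to compute exact mean and variance.
plain = np.array([ridge_grad(w, X[[i]], y[[i]], lam) for i in range(n)])
vr = np.array([vr_minibatch_grad(w, w_snap, full_grad_snap, X, y, lam, np.array([i]))
               for i in range(n)])

bias = np.linalg.norm(vr.mean(axis=0) - true_grad)
var_plain = np.mean(np.sum((plain - true_grad) ** 2, axis=1))
var_vr = np.mean(np.sum((vr - true_grad) ** 2, axis=1))
print(bias, var_plain, var_vr)
```

Averaging the estimate over all single-sample batches reproduces the full gradient, which is the unbiasedness property; the variance gap is largest when the iterate is close to the snapshot.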
2.Optimization Algorithm Synthesis based on Integral Quadratic Constraints: A Tutorial
Authors:Carsten W. Scherer, Christian Ebenbauer, Tobias Holicki
Abstract: We expose in a tutorial fashion the mechanisms which underlie the synthesis of optimization algorithms based on dynamic integral quadratic constraints. We reveal how these tools from robust control allow us to design accelerated gradient descent algorithms with optimal guaranteed convergence rates by solving small-sized convex semi-definite programs. It is shown that this extends to the design of extremum controllers, with the goal of regulating the output of a general linear closed-loop system to the minimum of an objective function. Numerical experiments illustrate that we can not only recover gradient descent and the triple momentum variant of Nesterov's accelerated first order algorithm, but also automatically synthesize optimal algorithms even if the gradient information is passed through non-trivial dynamics, such as time-delays.
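For readers unfamiliar with the triple momentum method mentioned above, here is a minimal sketch of its iteration with the parameter choice commonly stated in the literature for an $m$-strongly convex, $L$-smooth objective. The test quadratic and all names are assumptions for illustration; the sketch does not reproduce the synthesis procedure of the tutorial.

```python
import numpy as np

def triple_momentum(grad, x0, m, L, iters):
    """Triple momentum iteration for an m-strongly convex, L-smooth function."""
    kappa = L / m
    rho = 1.0 - 1.0 / np.sqrt(kappa)            # target linear rate
    alpha = (1.0 + rho) / L
    beta = rho**2 / (2.0 - rho)
    gamma = rho**2 / ((1.0 + rho) * (2.0 - rho))
    delta = rho**2 / (1.0 - rho**2)
    xi_prev, xi = x0.copy(), x0.copy()
    for _ in range(iters):
        y = (1.0 + gamma) * xi - gamma * xi_prev          # gradient query point
        xi_next = (1.0 + beta) * xi - beta * xi_prev - alpha * grad(y)
        xi_prev, xi = xi, xi_next
    return (1.0 + delta) * xi - delta * xi_prev           # output point

# Hypothetical test problem: f(x) = 0.5 x^T H x - b^T x with spectrum in [1, 10].
H = np.diag([1.0, 3.0, 10.0])
b = np.array([1.0, -2.0, 0.5])
grad = lambda x: H @ x - b
x_star = np.linalg.solve(H, b)

x_tm = triple_momentum(grad, np.zeros(3), m=1.0, L=10.0, iters=200)
tm_error = np.linalg.norm(x_tm - x_star)
print(tm_error)
```

On a quadratic the iteration is a linear recursion whose characteristic roots at the extreme eigenvalues have modulus at most $\rho = 1 - 1/\sqrt{\kappa}$, which is why the error decays geometrically.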
3.Robust Exponential Stability and Invariance Guarantees with General Dynamic O'Shea-Zames-Falb Multipliers
Authors:Carsten W. Scherer
Abstract: We propose novel time-domain dynamic integral quadratic constraints with a terminal cost for exponentially weighted slope-restricted gradients of not necessarily convex functions. This extends recent results for subdifferentials of convex functions and their link to so-called O'Shea-Zames-Falb multipliers. The benefit of merging time-domain and frequency-domain techniques is demonstrated for linear saturated systems.
4.Data-driven optimal control under safety constraints using sparse Koopman approximation
Authors:Hongzhe Yu, Joseph Moyalan, Umesh Vaidya, Yongxin Chen
Abstract: In this work we approach the dual optimal reach-safe control problem using sparse approximations of the Koopman operator. Matrix approximation of the Koopman operator requires solving a least-squares (LS) problem in the lifted function space, which is computationally intractable for fine discretizations and high dimensions. The state transitional physical meaning of the Koopman operator leads to a sparse LS problem in this space. Leveraging this sparsity, we propose an efficient method to solve the sparse LS problem where we reduce the problem dimension dramatically by formulating the problem using only the non-zero elements in the approximation matrix with known sparsity pattern. The obtained matrix approximation of the operators is then used in a dual optimal reach-safe problem formulation where a linear program with sparse linear constraints naturally appears. We validate our proposed method on various dynamical systems and show that the computation time for operator approximation is greatly reduced with high precision in the solutions.
5.Gauss-Southwell type descent methods for low-rank matrix optimization
Authors:Guillaume Olikier, André Uschmajew, Bart Vandereycken
Abstract: We consider gradient-related methods for low-rank matrix optimization with a smooth cost function. The methods operate on single factors of the low-rank factorization and share aspects of both alternating and Riemannian optimization. Two possible choices for the search directions based on Gauss-Southwell type selection rules are compared: one using the gradient of a factorized non-convex formulation, the other using the Riemannian gradient. While both methods provide gradient convergence guarantees that are similar to the unconstrained case, the version based on Riemannian gradient is significantly more robust with respect to small singular values and the condition number of the cost function, as illustrated by numerical experiments. As a side result of our approach, we also obtain new convergence results for the alternating least squares method.
6.Mean-field limit for stochastic control problems under state constraint
Authors:Samuel Daudin
Abstract: We study the convergence problem of mean-field control theory in the presence of state constraints and non-degenerate idiosyncratic noise. Our main result is the convergence of the value functions associated to stochastic control problems for many interacting particles subject to symmetric, almost-sure constraints toward the value function of a control problem of mean-field type, set on the space of probability measures. The key step of the proof is to show that admissible controls for the limit problem can be turned into admissible controls for the $N$-particle problem up to a correction which vanishes as the number of particles increases. The rest of the proof relies on compactness methods. We also provide optimality conditions for the mean-field problem and discuss the regularity of the optimal controls. Finally we present some applications and connections with large deviations for weakly interacting particle systems.
1.On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Authors:Jiacai Liu, Jinchi Chen, Ke Wei
Abstract: The convergence of deterministic policy gradient under the Hadamard parametrization is studied in the tabular setting and the global linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_0$ iterations, where $k_0$ is a constant that only depends on the MDP problem and the step size. Overall, the algorithm displays a linear convergence rate for all the iterations, with a looser constant than that for the local linear convergence rate.
2.A converse Lyapunov-type theorem for control systems with regulated cost
Authors:Anna Chiara Lai, Monica Motta
Abstract: Given a nonlinear control system, a target set, a nonnegative integral cost, and a continuous function $W$, we say that the system is globally asymptotically controllable to the target with $W$-regulated cost, whenever, starting from any point $z$, among the strategies that achieve classical asymptotic controllability we can select one that also keeps the cost less than $W(z)$. In this paper, assuming mild regularity hypotheses on the data, we prove that a necessary and sufficient condition for global asymptotic controllability with regulated cost is the existence of a special, continuous Control Lyapunov function, called a Minimum Restraint function. The main novelty is the necessity implication, obtained here for the first time. Nevertheless, the sufficiency condition extends previous results based on semiconcavity of the Minimum Restraint function, while we require mere continuity.
3.Bilevel Optimal Control: Theory, Algorithms, and Applications
Authors:Stephan Dempe, Markus Friedemann, Felix Harder, Patrick Mehlitz, Gerd Wachsmuth
Abstract: In this chapter, we are concerned with inverse optimal control problems, i.e., optimization models which are used to identify parameters in optimal control problems from given measurements. Here, we focus on linear-quadratic optimal control problems with control constraints where the reference control plays the role of the parameter and has to be reconstructed. First, it is shown that pointwise M-stationarity, associated with the reformulation of the hierarchical model as a so-called mathematical problem with complementarity constraints (MPCC) in function spaces, provides a necessary optimality condition under some additional assumptions on the data. Second, we review two recently developed algorithms (an augmented Lagrangian method and a nonsmooth Newton method) for the computational identification of M-stationary points of finite-dimensional MPCCs. Finally, a numerical comparison of these methods, based on instances of the appropriately discretized inverse optimal control problem of our interest, is provided.
4.Convergence of the vertical Gradient flow for the Gaussian Monge problem
Authors:Erik Jansson, Klas Modin
Abstract: We investigate a matrix dynamical system related to optimal mass transport in the linear category, namely, the problem of finding an optimal invertible matrix by which two covariance matrices are congruent. We first review the differential geometric structure of the problem in terms of a principal fiber bundle. The dynamical system is a gradient flow restricted to the fibers of the bundle. We prove global existence of solutions to the flow, with convergence to the polar decomposition of the matrix given as initial data. The convergence is illustrated in a numerical example.
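As background for the congruence problem described above, the classical closed-form optimal transport map between two centered Gaussians with covariances $\Sigma_0$ and $\Sigma_1$ is the symmetric positive definite matrix $T = \Sigma_0^{-1/2}(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2})^{1/2}\Sigma_0^{-1/2}$, which satisfies $T\Sigma_0 T = \Sigma_1$. The sketch below only evaluates this formula numerically; it does not implement the gradient flow studied in the paper, and the helper names and random covariances are assumptions for illustration.

```python
import numpy as np

def sqrtm_psd(S):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def gaussian_monge_map(S0, S1):
    """Symmetric positive definite T with T @ S0 @ T = S1."""
    R = sqrtm_psd(S0)
    R_inv = np.linalg.inv(R)
    return R_inv @ sqrtm_psd(R @ S1 @ R) @ R_inv

# Hypothetical covariance matrices, made safely positive definite.
rng = np.random.default_rng(0)
B0 = rng.standard_normal((3, 3))
B1 = rng.standard_normal((3, 3))
S0 = B0 @ B0.T + np.eye(3)
S1 = B1 @ B1.T + np.eye(3)

T = gaussian_monge_map(S0, S1)
congruence_residual = np.linalg.norm(T @ S0 @ T.T - S1)
print(congruence_residual)
```

Any matrix of the form $TQ$ with $Q\Sigma_0 Q^\top = \Sigma_0$ also realizes the congruence; the symmetric positive definite choice is the one singled out by optimality.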
5.A fresh look at nonsmooth Levenberg--Marquardt methods with applications to bilevel optimization
Authors:Lateef O. Jolaoso, Patrick Mehlitz, Alain B. Zemkoho
Abstract: In this paper, we revisit the classical problem of solving over-determined systems of nonsmooth equations numerically. We suggest a nonsmooth Levenberg--Marquardt method for its solution which, in contrast to the existing literature, does not require local Lipschitzness of the data functions. This is possible when using Newton-differentiability instead of semismoothness as the underlying tool of generalized differentiation. Conditions for fast local convergence of the method are given. Afterwards, in the context of over-determined mixed nonlinear complementarity systems, our findings are applied, and globalized solution methods, based on a residual induced by the maximum and the Fischer--Burmeister function, respectively, are constructed. The assumptions for fast local convergence are worked out and compared. Finally, these methods are applied for the numerical solution of bilevel optimization problems. We recall the derivation of a stationarity condition taking the shape of an over-determined mixed nonlinear complementarity system involving a penalty parameter, formulate assumptions for local fast convergence of our solution methods explicitly, and present results of numerical experiments. Particularly, we investigate whether the treatment of the appearing penalty parameter as an additional variable is beneficial or not.
6.Efficient PDE-Constrained optimization under high-dimensional uncertainty using derivative-informed neural operators
Authors:Dingcheng Luo, Thomas O'Leary-Roseberry, Peng Chen, Omar Ghattas
Abstract: We propose a novel machine learning framework for solving optimization problems governed by large-scale partial differential equations (PDEs) with high-dimensional random parameters. Such optimization under uncertainty (OUU) problems may be computationally prohibitive using classical methods, particularly when a large number of samples is needed to evaluate risk measures at every iteration of an optimization algorithm, where each sample requires the solution of an expensive-to-solve PDE. To address this challenge, we propose a new neural operator approximation of the PDE solution operator that has the combined merits of (1) accurate approximation of not only the map from the joint inputs of random parameters and optimization variables to the PDE state, but also its derivative with respect to the optimization variables, (2) efficient construction of the neural network using reduced basis architectures that are scalable to high-dimensional OUU problems, and (3) requiring only a limited number of training data to achieve high accuracy for both the PDE solution and the OUU solution. We refer to such neural operators as multi-input reduced basis derivative informed neural operators (MR-DINOs). We demonstrate the accuracy and efficiency of our approach through several numerical experiments, i.e. the risk-averse control of a semilinear elliptic PDE and the steady state Navier--Stokes equations in two and three spatial dimensions, each involving random field inputs. Across the examples, MR-DINOs offer $10^{3}$--$10^{7} \times$ reductions in execution time, and are able to produce OUU solutions of comparable accuracies to those from standard PDE based solutions while being over $10 \times$ more cost-efficient after factoring in the cost of construction.
7.Alternating Minimization for Regression with Tropical Rational Functions
Authors:Alex Dunbar, Lars Ruthotto
Abstract: We propose an alternating minimization heuristic for regression over the space of tropical rational functions with fixed exponents. The method alternates between fitting the numerator and denominator terms via tropical polynomial regression, which is known to admit a closed form solution. We demonstrate the behavior of the alternating minimization method experimentally. Experiments demonstrate that the heuristic provides a reasonable approximation of the input data. Our work is motivated by applications to ReLU neural networks, a popular class of network architectures in the machine learning community which are closely related to tropical rational functions.
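To make the "closed form solution" for tropical polynomial regression concrete, here is a minimal sketch of the classical max-plus construction for a tropical polynomial $f(x) = \max_i (c_i + \langle a_i, x\rangle)$ with fixed exponents $a_i$: take the greatest coefficients that keep $f \le y$ on the data, then shift by half the largest residual to minimize the sup-norm error. Using the sup-norm loss is an assumption here (the abstract does not state the loss), and the function names and data are hypothetical.

```python
import numpy as np

def tropical_eval(coeffs, exponents, X):
    """Evaluate f(x) = max_i (coeffs[i] + <exponents[i], x>) at each row of X."""
    return np.max(X @ exponents.T + coeffs, axis=1)

def fit_tropical_polynomial(exponents, X, y):
    """Closed-form sup-norm fit of the coefficients; returns (coeffs, max_error)."""
    A = X @ exponents.T                          # A[j, i] = <a_i, x_j>
    lower = np.min(y[:, None] - A, axis=0)       # greatest coeffs with f <= y on the data
    resid = y - np.max(A + lower, axis=1)        # nonnegative residuals
    mu = np.max(resid)
    return lower + 0.5 * mu, 0.5 * mu

# Hypothetical data: exact samples of a tropical polynomial, then a noisy copy.
rng = np.random.default_rng(0)
exponents = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
true_coeffs = np.array([0.5, -0.2, 0.3, -1.0])
X = rng.uniform(-2.0, 2.0, size=(100, 2))
y_exact = tropical_eval(true_coeffs, exponents, X)
y_noisy = y_exact + 0.05 * rng.standard_normal(100)

c_exact, err_exact = fit_tropical_polynomial(exponents, X, y_exact)
c_noisy, err_noisy = fit_tropical_polynomial(exponents, X, y_noisy)
pred_noisy = tropical_eval(c_noisy, exponents, X)
print(err_exact, err_noisy)
```

An alternating scheme for a tropical rational function (a difference of two such polynomials) could call this fit once for the numerator with the denominator held fixed and once for the denominator with the numerator held fixed; that outer loop is not shown here.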
1.Stochastic Control/Stopping Problem with Expectation Constraints
Authors:Erhan Bayraktar, Song Yao
Abstract: We study a stochastic control/stopping problem with a series of inequality-type and equality-type expectation constraints in a general non-Markovian framework. We demonstrate that the stochastic control/stopping problem with expectation constraints (CSEC) is independent of a specific probability setting and is equivalent to the constrained stochastic control/stopping problem in weak formulation (an optimization over joint laws of Brownian motion, state dynamics, diffusion controls and stopping rules on an enlarged canonical space). Using a martingale-problem formulation of controlled SDEs in the spirit of \cite{Stroock_Varadhan}, we characterize the probability classes in weak formulation by countably many actions of canonical processes, and thus obtain the upper semi-analyticity of the CSEC value function. Then we employ a measurable selection argument to establish a dynamic programming principle (DPP) in weak formulation for the CSEC value function, in which the conditional expected costs act as additional states for constraint levels at the intermediate horizon. This article extends the results of \cite{Elk_Tan_2013b} to the expectation-constraint case. We extend our previous work \cite{OSEC_stopping} to the more complicated setting where the diffusion is controlled. Compared to that paper, the topological properties of diffusion-control spaces and the corresponding measurability are more technically involved, which complicates the arguments, especially for the measurable selection for the super-solution side of the DPP in the weak formulation.
2.Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization
Authors:Quanqi Hu, Zi-Hao Qiu, Zhishuai Guo, Lijun Zhang, Tianbao Yang
Abstract: In this paper, we consider non-convex multi-block bilevel optimization (MBBO) problems, which involve $m\gg 1$ lower level problems and have important applications in machine learning. Designing a stochastic gradient and controlling its variance is more intricate due to the hierarchical sampling of blocks and data and the unique challenge of estimating the hyper-gradient. We aim to achieve three nice properties for our algorithm: (a) matching the state-of-the-art complexity of standard BO problems with a single block; (b) achieving parallel speedup by sampling $I$ blocks and sampling $B$ samples for each sampled block per iteration; (c) avoiding the computation of the inverse of a high-dimensional Hessian matrix estimator. However, achieving all of these is non-trivial: existing works achieve only one or two of these properties. To address the involved challenges for achieving (a, b, c), we propose two stochastic algorithms by using advanced blockwise variance-reduction techniques for tracking the Hessian matrices (for low-dimensional problems) or the Hessian-vector products (for high-dimensional problems), and prove an iteration complexity of $O(\frac{m\epsilon^{-3}\mathbb{I}(I<m)}{I\sqrt{I}} + \frac{m\epsilon^{-3}}{I\sqrt{B}})$ for finding an $\epsilon$-stationary point under appropriate conditions. We also conduct experiments to verify the effectiveness of the proposed algorithms in comparison with existing MBBO algorithms.
3.Around a Farkas type Lemma
Authors:Nguyen Dinh, Miguel A. Goberna, M. Volle
Abstract: The first two authors of this paper asserted in Lemma 4 of "New Farkas-type constraint qualifications in convex infinite programming" (DOI: 10.1051/cocv:2007027) that a given reverse convex inequality is a consequence of a given convex system satisfying the Farkas-Minkowski constraint qualification if and only if a certain set depending on the data contains a particular point of the vertical axis. This paper identifies a hidden assumption in this reverse Farkas lemma which always holds in its applications to nontrivial optimization problems. Moreover, it shows that the statement remains valid when the Farkas-Minkowski constraint qualification fails by replacing the mentioned set by its closure. This hidden assumption is also characterized in terms of the data. Finally, the paper provides some applications to convex infinite systems and to convex infinite optimization problems.
4.Infinite-dimensional moment-SOS hierarchy for nonlinear partial differential equations
Authors:Didier Henrion, Maria Infusino, Salma Kuhlmann, Victor Vinnikov
Abstract: We formulate a class of nonlinear evolution partial differential equations (PDEs) as linear optimization problems on moments of positive measures supported on infinite-dimensional vector spaces. Using sums of squares (SOS) representations of polynomials in these spaces, we can prove convergence of a hierarchy of finite-dimensional semidefinite relaxations solving approximately these infinite-dimensional optimization problems. As an illustration, we report on numerical experiments for solving the heat equation subject to a nonlinear perturbation.
5.Global minimization of polynomial integral functionals
Authors:Giovanni Fantuzzi, Federico Fuentes
Abstract: We describe a 'discretize-then-relax' strategy to globally minimize integral functionals over functions $u$ in a Sobolev space satisfying prescribed Dirichlet boundary conditions. The strategy applies whenever the integral functional depends polynomially on $u$ and its derivatives, even if it is nonconvex. The 'discretize' step uses a bounded finite-element scheme to approximate the integral minimization problem with a convergent hierarchy of polynomial optimization problems over a compact feasible set, indexed by the decreasing size $h$ of the finite-element mesh. The 'relax' step employs sparse moment-SOS relaxations to approximate each polynomial optimization problem with a hierarchy of convex semidefinite programs, indexed by an increasing relaxation order $\omega$. We prove that, as $\omega\to\infty$ and $h\to 0$, solutions of such semidefinite programs provide approximate minimizers that converge in $L^p$ to the global minimizer of the original integral functional if this is unique. We also report computational experiments that show our numerical strategy works well even when technical conditions required by our theoretical analysis are not satisfied.
6.Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets
Authors:Mengmeng Li, Tobias Sutter, Daniel Kuhn
Abstract: We propose a policy gradient algorithm for robust infinite-horizon Markov Decision Processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that display statistical optimality properties and make optimal use of limited data often fail to be rectangular. Unfortunately, the corresponding robust MDPs cannot be solved with dynamic programming techniques and are in fact provably intractable. This prompts us to develop a projected Langevin dynamics algorithm tailored to the robust policy evaluation problem, which offers global optimality guarantees. We also propose a deterministic policy gradient method that solves the robust policy evaluation problem approximately, and we prove that the approximation error scales with a new measure of non-rectangularity of the uncertainty set. Numerical experiments showcase that our projected Langevin dynamics algorithm can escape local optima, while algorithms tailored to rectangular uncertainty fail to do so.
7.Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates
Authors:Damien Scieur
Abstract: Despite the impressive numerical performance of quasi-Newton and Anderson/nonlinear acceleration methods, their global convergence rates have remained elusive for over 50 years. This paper addresses this long-standing question by introducing a framework that derives novel and adaptive quasi-Newton or nonlinear/Anderson acceleration schemes. Under mild assumptions, the proposed iterative methods exhibit explicit, non-asymptotic convergence rates that blend those of gradient descent and Cubic Regularized Newton's method. Notably, these rates are achieved adaptively, as the method autonomously determines the optimal step size using a simple backtracking strategy. The proposed approach also includes an accelerated version that improves the convergence rate on convex functions. Numerical experiments demonstrate the efficiency of the proposed framework, even compared to a fine-tuned BFGS algorithm with line search.
8.Fast global convergence of gradient descent for low-rank matrix approximation
Authors:Hengchao Chen, Xin Chen, Mohamad Elmasri, Qiang Sun
Abstract: This paper investigates gradient descent for solving low-rank matrix approximation problems. We begin by establishing the local linear convergence of gradient descent for symmetric matrix approximation. Building on this result, we prove the rapid global convergence of gradient descent, particularly when initialized with small random values. Remarkably, we show that even with moderate random initialization, which includes small random initialization as a special case, gradient descent achieves fast global convergence in scenarios where the top eigenvalues are identical. Furthermore, we extend our analysis to address asymmetric matrix approximation problems and investigate the effectiveness of a retraction-free eigenspace computation method. Numerical experiments strongly support our theory. In particular, the retraction-free algorithm outperforms the corresponding Riemannian gradient descent method, resulting in a significant 29% reduction in runtime.
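The setting of symmetric low-rank approximation with small random initialization can be sketched as plain gradient descent on $f(X) = \tfrac14\|A - XX^\top\|_F^2$, whose gradient is $(XX^\top - A)X$ for symmetric $A$. The step size, initialization scale, iteration count and test matrix below are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def gd_symmetric_lowrank(A, r, step, iters, init_scale, seed=0):
    """Gradient descent on f(X) = 0.25 * ||A - X X^T||_F^2 from a small random start."""
    rng = np.random.default_rng(seed)
    X = init_scale * rng.standard_normal((A.shape[0], r))
    for _ in range(iters):
        X = X - step * (X @ X.T - A) @ X      # gradient step
    return X

# Hypothetical symmetric PSD test matrix with a clear gap after the top two eigenvalues.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
eigs = np.array([4.0, 3.0, 0.5, 0.2, 0.1])
A = Q @ np.diag(eigs) @ Q.T

X = gd_symmetric_lowrank(A, r=2, step=0.05, iters=3000, init_scale=1e-3)

# Best rank-2 approximation from the known eigendecomposition, for comparison.
A_best = Q[:, :2] @ np.diag(eigs[:2]) @ Q[:, :2].T
lowrank_error = np.linalg.norm(X @ X.T - A_best)
print(lowrank_error)
```

With a small start, the components along the leading eigenvectors grow fastest, so the iterate is drawn toward the top eigenspace before the local linear phase takes over.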
9.Learning for Robust Optimization
Authors:Irina Wang, Cole Becker, Bart Van Parys, Bartolomeo Stellato
Abstract: We propose a data-driven technique to automatically learn the uncertainty sets in robust optimization. Our method reshapes the uncertainty sets by minimizing the expected performance across a family of problems while guaranteeing constraint satisfaction. We learn the uncertainty sets using a novel stochastic augmented Lagrangian method that relies on differentiating the solutions of the robust optimization problems with respect to the parameters of the uncertainty set. We show sublinear convergence to stationary points under mild assumptions, and finite-sample probabilistic guarantees of constraint satisfaction using empirical process theory. Our approach is very flexible and can learn a wide variety of uncertainty sets while preserving tractability. Numerical experiments show that our method outperforms traditional approaches in robust and distributionally robust optimization in terms of out of sample performance and constraint satisfaction guarantees. We implemented our method in the open-source package LROPT.
10.Minimal Sparsity for Second-Order Moment-SOS Relaxations of the AC-OPF Problem
Authors:Adrien Le Franc (LAAS-POP), Victor Magron (LAAS-POP, IMT), Jean-Bernard Lasserre (LAAS-POP), Manuel Ruiz, Patrick Panciatici
Abstract: AC-OPF (Alternating Current Optimal Power Flow) aims at minimizing the operating costs of a power grid under physical constraints on voltages and power injections. Its mathematical formulation results in a nonconvex polynomial optimization problem which is hard to solve in general, but that can be tackled by a sequence of SDP (Semidefinite Programming) relaxations corresponding to the steps of the moment-SOS (Sums-Of-Squares) hierarchy. Unfortunately, the size of these SDPs grows drastically in the hierarchy, so that even second-order relaxations exploiting the correlative sparsity pattern of AC-OPF are hardly numerically tractable for large instances -- with thousands of power buses. Our contribution lies in a new sparsity framework, termed minimal sparsity, inspired by the specific structure of power flow equations. Despite its heuristic nature, numerical examples show that minimal sparsity allows the computation of highly accurate second-order moment-SOS relaxations of AC-OPF, while requiring far less computing time and memory resources than the standard correlative sparsity pattern. Thus, we manage to compute second-order relaxations on test cases with about 6000 power buses, which we believe to be unprecedented.
1.Adaptive Localized Cayley Parametrization for Optimization over Stiefel Manifold
Authors:Keita Kume, Isao Yamada
Abstract: We present an adaptive parametrization strategy for optimization problems over the Stiefel manifold by using generalized Cayley transforms to utilize powerful Euclidean optimization algorithms efficiently. The generalized Cayley transform can translate an open dense subset of the Stiefel manifold into a vector space, and the open dense subset is determined according to a tunable parameter called a center point. With the generalized Cayley transform, we recently proposed the naive Cayley parametrization, which reformulates the optimization problem over the Stiefel manifold as that over the vector space. Although this reformulation enables us to transplant powerful Euclidean optimization algorithms, their convergences may become slow by a poor choice of center points. To avoid such a slow convergence, in this paper, we propose to estimate adaptively 'good' center points so that the reformulated problem can be solved faster. We also present a unified convergence analysis, regarding the gradient, in cases where fairly standard Euclidean optimization algorithms are employed in the proposed adaptive parametrization strategy. Numerical experiments demonstrate that (i) the proposed strategy succeeds in escaping from the slow convergence observed in the naive Cayley parametrization strategy; (ii) the proposed strategy outperforms the standard strategy which employs a retraction.
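For orientation, the snippet below sketches the classical Cayley transform, which maps a skew-symmetric matrix to an orthogonal one. It is not the generalized, center-point-dependent transform of the abstract. It only illustrates the basic idea that points on an orthogonal (or Stiefel) manifold can be parametrized by elements of a vector space. The function name `cayley` and the random test data are assumptions made for this example.

```python
import numpy as np

def cayley(A):
    """Classical Cayley transform (I + A)^{-1} (I - A) of a skew-symmetric A."""
    I = np.eye(A.shape[0])
    # I + A is invertible for every real skew-symmetric A,
    # because the eigenvalues of A are purely imaginary.
    return np.linalg.solve(I + A, I - A)

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B - B.T          # skew-symmetric: A^T = -A
Q = cayley(A)        # orthogonal 4 x 4 matrix
U = Q[:, :2]         # first two columns: a point on the Stiefel manifold St(2, 4)
```

Any unconstrained update of the entries of `A` keeps `Q` orthogonal, which is what allows Euclidean optimizers to be reused in such parametrizations.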
2.Communication Efficient Distributed Newton Method with Fast Convergence Rates
Authors:Chengchang Liu, Lesi Chen, Luo Luo, John C. S. Lui
Abstract: We propose a communication and computation efficient second-order method for distributed optimization. For each iteration, our method only requires $\mathcal{O}(d)$ communication complexity, where $d$ is the problem dimension. We also provide theoretical analysis to show that the proposed method has a convergence rate similar to that of the classical second-order optimization algorithms. Concretely, our method can find $\big(\epsilon, \sqrt{dL\epsilon}\,\big)$-second-order stationary points for nonconvex problems in $\mathcal{O}\big(\sqrt{dL}\,\epsilon^{-3/2}\big)$ iterations, where $L$ is the Lipschitz constant of the Hessian. Moreover, it enjoys local superlinear convergence under the strongly-convex assumption. Experiments on both convex and nonconvex problems show that our proposed method performs significantly better than baselines.
3.A Parameter-Free Conditional Gradient Method for Composite Minimization under Hölder Condition
Authors:Masaru Ito, Zhaosong Lu, Chuan He
Abstract: In this paper we consider a composite optimization problem that minimizes the sum of a weakly smooth function and a convex function with either a bounded domain or a uniformly convex structure. In particular, we first present a parameter-dependent conditional gradient method for this problem, whose step sizes require prior knowledge of the parameters associated with the Hölder continuity of the gradient of the weakly smooth function, and establish its rate of convergence. Given that these parameters could be unknown or known but possibly conservative, such a method may suffer from implementation issues or slow convergence. We therefore propose a parameter-free conditional gradient method whose step size is determined by using a constructive local quadratic upper approximation and an adaptive line search scheme, without using any problem parameter. We show that this method achieves the same rate of convergence as the parameter-dependent conditional gradient method. Preliminary experiments are also conducted and illustrate the superior performance of the parameter-free conditional gradient method over the methods with some other step size rules.
4.Necessary and sufficient conditions for unique solvability of absolute value equations: A Survey
Authors:Shubham Kumar, Deepmala
Abstract: In this survey paper, we focus on the necessary and sufficient conditions for the unique solvability and unsolvability of the absolute value equations (AVEs) established during the last twenty years (2004 to 2023). We discuss unique solvability conditions for various types of AVEs like standard absolute value equation (AVE), Generalized AVE (GAVE), New generalized AVE (NGAVE), Triple AVE (TAVE) and a class of NGAVE based on interval matrix, P-matrix, singular value conditions, spectral radius and $\mathcal{W}$-property. Based on the unique solution of AVEs, we also discuss unique solvability conditions for linear complementarity problems (LCP) and horizontal linear complementarity problems (HLCP).
1.Stochastic First-Order Algorithms for Constrained Distributionally Robust Optimization
Authors:Hyungki Im, Paul Grigas
Abstract: We consider distributionally robust optimization (DRO) problems, reformulated as distributionally robust feasibility (DRF) problems, with multiple expectation constraints. We propose a generic stochastic first-order meta-algorithm, where the decision variables and uncertain distribution parameters are each updated separately by applying stochastic first-order methods. We then specialize our results to the case of using two specific versions of stochastic mirror descent (SMD): (i) a novel approximate version of SMD to update the decision variables, and (ii) the bandit mirror descent method to update the distribution parameters in the case of $\chi^2$-divergence sets. For this specialization, we demonstrate that the total number of iterations is independent of the dimensions of the decision variables and distribution parameters. Moreover, the cost per iteration to update both sets of variables is nearly independent of the dimension of the distribution parameters, allowing for high dimensional ambiguity sets. Furthermore, we show that the total number of iterations of our algorithm has a logarithmic dependence on the number of constraints. Experiments on logistic regression with fairness constraints, personalized parameter selection in a social network, and the multi-item newsvendor problem verify the theoretical results and show the usefulness of the algorithm, in particular when the dimension of the distribution parameters is large.
1.Highly Smoothness Zero-Order Methods for Solving Optimization Problems under PL Condition
Authors:Aleksandr Lobanov, Alexander Gasnikov, Fedor Stonyakin
Abstract: In this paper, we study the black box optimization problem under the Polyak--Lojasiewicz (PL) condition, assuming that the objective function is not just smooth, but has higher smoothness. By using "kernel-based" approximation instead of the exact gradient in the Stochastic Gradient Descent method, we improve the best known convergence results in the class of gradient-free algorithms for solving problems under the PL condition. We generalize our results to the case where a zero-order oracle returns a function value at a point with some adversarial noise. We verify our theoretical results on the example of solving a system of nonlinear equations.
2.First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
Authors:Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines
Abstract: This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.
3.Neural incomplete factorization: learning preconditioners for the conjugate gradient method
Authors:Paul Häusner, Ozan Öktem, Jens Sjölund
Abstract: In this paper, we develop a novel data-driven approach to accelerate solving large-scale linear equation systems encountered in scientific computing and optimization. Our method utilizes self-supervised training of a graph neural network to generate an effective preconditioner tailored to the specific problem domain. By replacing conventional hand-crafted preconditioners used with the conjugate gradient method, our approach, named neural incomplete factorization (NeuralIF), significantly speeds up convergence and improves computational efficiency. At the core of our method is a novel message-passing block, inspired by sparse matrix theory, that aligns with the objective to find a sparse factorization of the matrix. We evaluate our proposed method on both a synthetic and a real-world problem arising from scientific computing. Our results demonstrate that NeuralIF consistently outperforms the most common general-purpose preconditioners, including the incomplete Cholesky method, achieving competitive performance across various metrics even outside the training data distribution.
4.Certificates of Nonexistence for Lyapunov-Based Stability, Stabilizability and Detectability of LPV Systems
Authors:T. J. Meijer, V. S. Dolk, W. P. M. H. Heemels
Abstract: By computing Lyapunov functions of a certain, convenient structure, Lyapunov-based methods guarantee stability properties of the system or, when performing synthesis, of the relevant closed-loop or error dynamics. In doing so, they provide conclusive affirmative answers to many analysis and design questions in systems and control. When these methods fail to produce a feasible solution, however, they often remain inconclusive due to (a) the method being conservative or (b) the fact that there may be multiple causes for infeasibility, such as ill-conditioning, solver tolerances or true infeasibility. To overcome this, we develop LMI-based theorems of alternatives based upon which we can guarantee, by computing a so-called certificate of nonexistence, that no poly-quadratic Lyapunov function exists for a given linear parameter-varying system. We extend these ideas to also certify the nonexistence of controllers and observers for which the corresponding closed-loop/error dynamics admit a poly-quadratic Lyapunov function. Finally, we illustrate our results in some numerical case studies.
5.An Optimal Structured Zeroth-order Algorithm for Non-smooth Optimization
Authors:Marco Rando, Cesare Molinari, Lorenzo Rosasco, Silvia Villa
Abstract: Finite-difference methods are a class of algorithms designed to solve black-box optimization problems by approximating a gradient of the target function on a set of directions. In black-box optimization, the non-smooth setting is particularly relevant since, in practice, differentiability and smoothness assumptions cannot be verified. To cope with nonsmoothness, several authors use a smooth approximation of the target function and show that finite difference methods approximate its gradient. Recently, it has been proved that imposing a structure in the directions allows improving performance. However, only the smooth setting was considered. To close this gap, we introduce and analyze O-ZD, the first structured finite-difference algorithm for non-smooth black-box optimization. Our method exploits a smooth approximation of the target function and we prove that it approximates its gradient on a subset of random orthogonal directions. We analyze the convergence of O-ZD under different assumptions. For non-smooth convex functions, we obtain the optimal complexity. In the non-smooth non-convex setting, we characterize the number of iterations needed to bound the expected norm of the smoothed gradient. For smooth functions, our analysis recovers existing results for structured zeroth-order methods for the convex case and extends them to the non-convex setting. We conclude with numerical simulations where assumptions are satisfied, observing that our algorithm has very good practical performance.
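To make the idea of a structured finite-difference surrogate concrete, the sketch below estimates a gradient from function values along random orthonormal directions. This is a generic forward-difference construction under assumptions of our own (QR-based directions, the d / num_dirs rescaling, the step `h`, and the name `orthogonal_fd_gradient`). It is not claimed to reproduce the O-ZD algorithm of the abstract.

```python
import numpy as np

def orthogonal_fd_gradient(f, x, num_dirs, h=1e-6, rng=None):
    """Forward-difference gradient surrogate on random orthonormal directions."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    # Reduced QR of a Gaussian matrix yields num_dirs orthonormal columns in R^d.
    Q, _ = np.linalg.qr(rng.standard_normal((d, num_dirs)))
    fx = f(x)
    g = np.zeros(d)
    for i in range(num_dirs):
        p = Q[:, i]
        # Directional finite difference along p, accumulated back along p.
        g += (f(x + h * p) - fx) / h * p
    # Rescale so that the estimate is unbiased for linear f (hypothetical choice).
    return (d / num_dirs) * g

a = np.array([1.0, -2.0, 3.0, 0.5])
f = lambda x: float(a @ x)          # linear test function with gradient a
g_full = orthogonal_fd_gradient(f, np.ones(4), num_dirs=4,
                                rng=np.random.default_rng(0))
```

With `num_dirs` equal to the dimension, the directions form an orthonormal basis, so for this linear test function the surrogate recovers the gradient `a` up to rounding error.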
6.Hybrid Methods in Polynomial Optimisation
Authors:Johannes Aspman, Gilles Bareilles, Vyacheslav Kungurtsev, Jakub Marecek, Martin Takáč
Abstract: The Moment/Sum-of-squares hierarchy provides a way to compute the global minimizers of polynomial optimization problems (POP), at the cost of solving a sequence of increasingly large semidefinite programs (SDPs). We consider large-scale POPs, for which interior-point methods are no longer able to solve the resulting SDPs. We propose an algorithm that combines a first-order method for solving the SDP relaxation, and a second-order method on a non-convex problem obtained from the POP. The switch from the first to the second-order method is based on a quantitative criterion, whose satisfaction ensures that Newton's method converges quadratically from its first iteration. This criterion leverages the point-estimation theory of Smale and the active-set identification. We illustrate the methodology to obtain global minimizers of large-scale optimal power flow problems.
7.Accelerated Methods for Riemannian Min-Max Optimization Ensuring Bounded Geometric Penalties
Authors:David Martínez-Rubio, Christophe Roux, Christopher Criscitiello, Sebastian Pokutta
Abstract: In this work, we study optimization problems of the form $\min_x \max_y f(x, y)$, where $f(x, y)$ is defined on a product Riemannian manifold $\mathcal{M} \times \mathcal{N}$ and is $\mu_x$-strongly geodesically convex (g-convex) in $x$ and $\mu_y$-strongly g-concave in $y$, for $\mu_x, \mu_y \geq 0$. We design accelerated methods when $f$ is $(L_x, L_y, L_{xy})$-smooth and $\mathcal{M}$, $\mathcal{N}$ are Hadamard. To that aim we introduce new g-convex optimization results, of independent interest: we show global linear convergence for metric-projected Riemannian gradient descent and improve existing accelerated methods by reducing geometric constants. Additionally, we complete the analysis of two previous works applying to the Riemannian min-max case by removing an assumption about iterates staying in a pre-specified compact set.
8.Two-timescale Extragradient for Finding Local Minimax Points
Authors:Jiseok Chae, Kyuwon Kim, Donghwan Kim
Abstract: Minimax problems are notoriously challenging to optimize. However, we demonstrate that the two-timescale extragradient can be a viable solution. By utilizing dynamical systems theory, we show that it converges to points that satisfy the second-order necessary condition of local minimax points, under a mild condition. This work surpasses all previous results as we eliminate a crucial assumption that the Hessian, with respect to the maximization variable, is nondegenerate.
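The following sketch shows what an extragradient iteration with two different step sizes looks like on a toy saddle problem. The test function f(x, y) = 0.5 x^2 + x y - 0.5 y^2, the step sizes `eta_x` and `eta_y`, and the function name are assumptions for illustration only. The toy problem is strongly convex-concave, so it does not exercise the degenerate-Hessian setting the abstract addresses.

```python
def two_timescale_extragradient(grad_x, grad_y, x, y,
                                eta_x=0.02, eta_y=0.2, n_iter=1000):
    """Extragradient with a slow descent step in x and a fast ascent step in y."""
    for _ in range(n_iter):
        # Extrapolation (look-ahead) step from the current point.
        x_half = x - eta_x * grad_x(x, y)
        y_half = y + eta_y * grad_y(x, y)
        # Actual update, using gradients evaluated at the look-ahead point.
        x, y = (x - eta_x * grad_x(x_half, y_half),
                y + eta_y * grad_y(x_half, y_half))
    return x, y

# Toy saddle problem f(x, y) = 0.5*x**2 + x*y - 0.5*y**2 with saddle point (0, 0).
grad_x = lambda x, y: x + y   # partial derivative of f in x
grad_y = lambda x, y: x - y   # partial derivative of f in y
x_star, y_star = two_timescale_extragradient(grad_x, grad_y, 1.0, 1.0)
```

Here the maximization variable uses a step ten times larger than the minimization variable, which is the sense in which the iteration has two timescales.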
9.Approaching Collateral Optimization for NISQ and Quantum-Inspired Computing
Authors:Megan Giron, Georgios Korpas, Waqas Parvaiz, Prashant Malik, Johannes Aspman
Abstract: Collateral optimization refers to the systematic allocation of financial assets to satisfy obligations or secure transactions, while simultaneously minimizing costs and optimizing the usage of available resources. This involves assessing a number of characteristics, such as the cost of funding and the quality of the underlying assets, to ascertain the optimal collateral quantity to be posted to cover exposure arising from a given transaction or a set of transactions. One of the common objectives is to minimise the cost of collateral required to mitigate the risk associated with a particular transaction or a portfolio of transactions while ensuring sufficient protection for the involved parties. Often, this results in a large-scale combinatorial optimization problem. In this study, we initially present a Mixed Integer Linear Programming (MILP) formulation for the collateral optimization problem, followed by a Quadratic Unconstrained Binary Optimization (QUBO) formulation in order to pave the way towards approaching the problem in a hybrid-quantum and NISQ-ready way. We conduct local computational small-scale tests using various Software Development Kits (SDKs) and discuss the behavior of our formulations as well as the potential for performance enhancements. We further survey the recent literature that proposes alternative ways to attack combinatorial optimization problems suitable for collateral optimization.
1.Block Coordinate Descent on Smooth Manifolds
Authors:Liangzu Peng, René Vidal
Abstract: Block coordinate descent is an optimization paradigm that iteratively updates one block of variables at a time, making it quite amenable to big data applications due to its scalability and performance. Its convergence behavior has been extensively studied in the (block-wise) convex case, but it is much less explored in the non-convex case. In this paper we analyze the convergence of block coordinate methods on non-convex sets and derive convergence rates on smooth manifolds under natural or weaker assumptions than prior work. Our analysis applies to many non-convex problems (e.g., generalized PCA, optimal transport, matrix factorization, Burer-Monteiro factorization, outlier-robust estimation, alternating projection, maximal coding rate reduction, neural collapse, adversarial attacks, homomorphic sensing), either yielding novel corollaries or recovering previously known results.
2.Accelerated Nonconvex ADMM with Self-Adaptive Penalty for Rank-Constrained Model Identification
Authors:Qingyuan Liu, Zhengchao Huang, Hao Ye, Dexian Huang, Chao Shang
Abstract: The alternating direction method of multipliers (ADMM) has been widely adopted in low-rank approximation and low-order model identification tasks; however, the performance of nonconvex ADMM is highly reliant on the choice of penalty parameter. To accelerate ADMM for solving rank-constrained identification problems, this paper proposes a new self-adaptive strategy for automatic penalty update. Guided by first-order analysis of the increment of the augmented Lagrangian, the self-adaptive penalty updating enables effective and balanced minimization of both primal and dual residuals and thus ensures a stable convergence. Moreover, improved efficiency can be obtained within the Anderson acceleration scheme. Numerical examples show that the proposed strategy significantly accelerates the convergence of nonconvex ADMM while alleviating the critical reliance on tedious tuning of penalty parameters.
3.The Minimization of Piecewise Functions: Pseudo Stationarity
Authors:Ying Cui, Junyi Liu, Jong-Shi Pang
Abstract: There are many significant applied contexts that require the solution of discontinuous optimization problems in finite dimensions. Yet these problems are very difficult, both computationally and analytically. With the functions being discontinuous and a minimizer (local or global) of the problems, even if it exists, being impossible to verifiably compute, a foremost question is what kind of "stationary solutions" one can expect to obtain; these solutions provide promising candidates for minimizers; i.e., their defining conditions are necessary for optimality. Motivated by recent results on sparse optimization, we introduce in this paper such a kind of solution, termed "pseudo B- (for Bouligand) stationary solution", for a broad class of discontinuous piecewise continuous optimization problems with objective and constraint defined by indicator functions of the positive real axis composite with functions that are possibly nonsmooth. We present two approaches for computing such a solution. One approach is based on lifting the problem to a higher dimension via the epigraphical formulation of the indicator functions; this requires the addition of some auxiliary variables. The other approach is based on certain continuous (albeit not necessarily differentiable) piecewise approximations of the indicator functions and the convergence to a pseudo B-stationary solution of the original problem is established. The conditions for convergence are discussed and illustrated by an example.
4.Decentralized Control of Linear Systems with Private Input and Measurement Information
Authors:Juanjuan Xu, Huanshui Zhang
Abstract: In this paper, we study the linear quadratic (LQ) optimal control problem of linear systems with private input and measurement information. The main challenge lies in the unavailability of other regulators' historical input information. To overcome this difficulty, we introduce a novel class of observers that use the private input and measurement information, and accordingly design a new class of decentralized controllers. In particular, it is verified that the cost function under the proposed decentralized controllers is asymptotically optimal in comparison with the optimal cost under the optimal state-feedback controller. To the best of our knowledge, the results presented in this paper are new and represent a fundamental contribution to classical decentralized control.
5.Improved Complexity Analysis of the Sinkhorn and Greenkhorn Algorithms for Optimal Transport
Authors:Jianzhou Luo, Dingchuan Yang, Ke Wei
Abstract: The Sinkhorn algorithm is a widely used method for solving the optimal transport problem, and the Greenkhorn algorithm is one of its variants. While there are modified versions of these two algorithms whose computational complexities are $O({n^2\|C\|_\infty^2\log n}/{\varepsilon^2})$ to achieve an $\varepsilon$-accuracy, the best known complexities for the vanilla versions are $O({n^2\|C\|_\infty^3\log n}/{\varepsilon^3})$. In this paper we fill this gap and show that the complexities of the vanilla Sinkhorn and Greenkhorn algorithms are indeed $O({n^2\|C\|_\infty^2\log n}/{\varepsilon^2})$. The analysis relies on the equicontinuity of the dual variables of the entropic regularized optimal transport problem, which is of independent interest.
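For readers unfamiliar with the vanilla Sinkhorn algorithm whose complexity is analyzed here, the sketch below shows the standard alternating row and column rescaling for entropic-regularized optimal transport. The regularization strength `reg`, the iteration count, the uniform marginals, and the random cost matrix are assumptions for this toy example. The stopping rule and rounding step used in complexity analyses are omitted.

```python
import numpy as np

def sinkhorn(C, r, c, reg=0.5, n_iter=500):
    """Vanilla Sinkhorn iterations for entropic-regularized optimal transport."""
    K = np.exp(-C / reg)              # Gibbs kernel of the cost matrix
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(n_iter):
        u = r / (K @ v)               # rescale rows to match marginal r
        v = c / (K.T @ u)             # rescale columns to match marginal c
    # Transport plan diag(u) K diag(v).
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
n = 5
C = rng.random((n, n))                # cost matrix with entries in [0, 1)
r = np.full(n, 1.0 / n)               # source marginal
c = np.full(n, 1.0 / n)               # target marginal
P = sinkhorn(C, r, c)
```

The Greenkhorn variant mentioned in the abstract instead updates only one row or column per iteration, chosen greedily. That variant is not shown here.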
6.A discrete-time Pontryagin maximum principle under rate constraints
Authors:Siddhartha Ganguly, Souvik Das, Debasish Chatterjee, Ravi Banavar
Abstract: Limited bandwidth and limited saturation in actuators are practical concerns in control systems. Mathematically, these limitations manifest as constraints being imposed on the control actions, their rates of change, and more generally, the global behavior of their paths. While the problem of actuator saturation has been studied extensively, little attention has been devoted to the problem of actuators having limited bandwidth. While attempts have been made in the direction of incorporating frequency constraints on state-action trajectories before, rate constraints on the control at the design stage have not been studied extensively in the discrete-time regime. This article contributes toward filling this lacuna. In particular, we establish a new discrete-time Pontryagin maximum principle with rate constraints being imposed on the control trajectories, and derive first-order necessary conditions for optimality. A brief discussion on the existence of optimal control is included, and numerical examples are provided to illustrate the results.
7.A note on the computational complexity of the moment-SOS hierarchy for polynomial optimization
Authors:Sander Gribling, Sven Polak, Lucas Slot
Abstract: The moment-sum-of-squares (moment-SOS) hierarchy is one of the most celebrated and widely applied methods for approximating the minimum of an n-variate polynomial over a feasible region defined by polynomial (in)equalities. A key feature of the hierarchy is that, at a fixed level, it can be formulated as a semidefinite program of size polynomial in the number of variables n. Although this suggests that it may therefore be computed in polynomial time, this is not necessarily the case. Indeed, as O'Donnell (2017) and later Raghavendra & Weitz (2017) show, there exist examples where the sos-representations used in the hierarchy have exponential bit-complexity. We study the computational complexity of the moment-SOS hierarchy, complementing and expanding upon earlier work of Raghavendra & Weitz (2017). In particular, we establish algebraic and geometric conditions under which polynomial-time computation is guaranteed to be possible.
8.ReSync: Riemannian Subgradient-based Robust Rotation Synchronization
Authors:Huikang Liu, Xiao Li, Anthony Man-Cho So
Abstract: This work presents ReSync, a Riemannian subgradient-based algorithm for solving the robust rotation synchronization problem, which arises in various engineering applications. ReSync solves a least-unsquared minimization formulation over the rotation group, which is nonsmooth and nonconvex, and aims at recovering the underlying rotations directly. We provide strong theoretical guarantees for ReSync under the random corruption setting. Specifically, we first show that the initialization procedure of ReSync yields a proper initial point that lies in a local region around the ground-truth rotations. We next establish the weak sharpness property of the aforementioned formulation and then utilize this property to derive the local linear convergence of ReSync to the ground-truth rotations. By combining these guarantees, we conclude that ReSync converges linearly to the ground-truth rotations under appropriate conditions. Experiment results demonstrate the effectiveness of ReSync.
9.Approximating Multiobjective Optimization Problems: How exact can you be?
Authors:Cristina Bazgan, Arne Herzel, Stefan Ruzika, Clemens Thielen, Daniel Vanderpooten
Abstract: It is well known that, under very weak assumptions, multiobjective optimization problems admit $(1+\varepsilon,\dots,1+\varepsilon)$-approximation sets (also called $\varepsilon$-Pareto sets) of polynomial cardinality (in the size of the instance and in $\frac{1}{\varepsilon}$). While an approximation guarantee of $1+\varepsilon$ for any $\varepsilon>0$ is the best one can expect for singleobjective problems (apart from solving the problem to optimality), even better approximation guarantees than $(1+\varepsilon,\dots,1+\varepsilon)$ can be considered in the multiobjective case since the approximation might be exact in some of the objectives. Hence, in this paper, we consider partially exact approximation sets that require to approximate each feasible solution exactly, i.e., with an approximation guarantee of $1$, in some of the objectives while still obtaining a guarantee of $1+\varepsilon$ in all others. We characterize the types of polynomial-cardinality, partially exact approximation sets that are guaranteed to exist for general multiobjective optimization problems. Moreover, we study minimum-cardinality partially exact approximation sets concerning (weak) efficiency of the contained solutions and relate their cardinalities to the minimum cardinality of a $(1+\varepsilon,\dots,1+\varepsilon)$-approximation set.
10.Efficiently Constructing Convex Approximation Sets in Multiobjective Optimization Problems
Authors:Stephan Helfrich, Stefan Ruzika, Clemens Thielen
Abstract: Convex approximation sets for multiobjective optimization problems are a well-studied relaxation of the common notion of approximation sets. Instead of approximating each image of a feasible solution by the image of some solution in the approximation set up to a multiplicative factor in each component, a convex approximation set only requires this multiplicative approximation to be achieved by some convex combination of finitely many images of solutions in the set. This makes convex approximation sets efficiently computable for a wide range of multiobjective problems - even for many problems for which (classic) approximation sets are hard to compute. In this article, we propose a polynomial-time algorithm to compute convex approximation sets that builds upon an exact or approximate algorithm for the weighted sum scalarization and is, therefore, applicable to a large variety of multiobjective optimization problems. The provided convex approximation quality is arbitrarily close to the approximation quality of the underlying algorithm for the weighted sum scalarization. In essence, our algorithm can be interpreted as an approximate variant of the dual variant of Benson's Outer Approximation Algorithm. Thus, in contrast to existing convex approximation algorithms from the literature, information on solutions obtained during the approximation process is utilized to significantly reduce both the practical running time and the cardinality of the returned solution sets while still guaranteeing the same worst-case approximation quality. We underpin these advantages by the first comparison of all existing convex approximation algorithms on several instances of the triobjective knapsack problem and the triobjective symmetric metric traveling salesman problem.
11.The Cooperative Maximum Capture Facility Location Problem
Authors:Concepción Domínguez, Ricardo Gázquez, Juan Miguel Morales, Salvador Pineda
Abstract: In the Maximum Capture Facility Location (MCFL) problem with a binary choice rule, a company intends to locate a series of facilities to maximize the captured demand, and customers patronize the facility that maximizes their utility. In this work, we generalize the MCFL problem assuming that the facilities of the decision maker act cooperatively to increase the customers' utility over the company. We propose a utility maximization rule between the captured utility of the decision maker and the opt-out utility of a competitor already installed in the market. Furthermore, we model the captured utility by means of an Ordered Median function (OMf) of the partial utilities of newly open facilities. We name this problem "the Cooperative Maximum Capture Facility Location problem" (CMCFL). The OMf serves as a means to compute the utility of each customer towards the company as an aggregation of ordered partial utilities, and constitutes a unifying framework for CMCFL models. We introduce a multiperiod non-linear bilevel formulation for the CMCFL with an embedded assignment problem characterizing the captured utilities. For this model, two exact resolution approaches are presented: a MILP reformulation with valid inequalities and an effective approach based on Benders' decomposition. Extensive computational experiments are provided to test our results with randomly generated data and an application to the location of charging stations for electric vehicles in the city of Trois-Rivières, Québec, is addressed.
12.Using Scalarizations for the Approximation of Multiobjective Optimization Problems: Towards a General Theory
Authors:Stephan Helfrich, Arne Herzel, Stefan Ruzika, Clemens Thielen
Abstract: We study the approximation of general multiobjective optimization problems with the help of scalarizations. Existing results state that multiobjective minimization problems can be approximated well by norm-based scalarizations. However, for multiobjective maximization problems, only impossibility results are known so far. Countering this, we show that all multiobjective optimization problems can, in principle, be approximated equally well by scalarizations. In this context, we introduce a transformation theory for scalarizations that establishes the following: Suppose there exists a scalarization that yields an approximation of a certain quality for arbitrary instances of multiobjective optimization problems with a given decomposition specifying which objective functions are to be minimized / maximized. Then, for each other decomposition, our transformation yields another scalarization that yields the same approximation quality for arbitrary instances of problems with this other decomposition. In this sense, the existing results about the approximation via scalarizations for minimization problems carry over to any other objective decomposition -- in particular, to maximization problems -- when suitably adapting the employed scalarization. We further provide necessary and sufficient conditions on a scalarization such that its optimal solutions achieve a constant approximation quality. We give an upper bound on the best achievable approximation quality that applies to general scalarizations and is tight for the majority of norm-based scalarizations applied in the context of multiobjective optimization. As a consequence, none of these norm-based scalarizations can induce approximation sets for optimization problems with maximization objectives, which unifies and generalizes the existing impossibility results concerning the approximation of maximization problems.
13.A Privacy-Preserving Finite-Time Push-Sum based Gradient Method for Distributed Optimization over Digraphs
Authors:Xiaomeng Chen, Wei Jiang, Themistoklis Charalambous, Ling Shi
Abstract: This paper addresses the problem of distributed optimization, where a network of agents represented as a directed graph (digraph) aims to collaboratively minimize the sum of their individual cost functions. Existing approaches for distributed optimization over digraphs, such as Push-Pull, require agents to exchange explicit state values with their neighbors in order to reach an optimal solution. However, this can result in the disclosure of sensitive and private information. To overcome this issue, we propose a state-decomposition-based privacy-preserving finite-time push-sum (PrFTPS) algorithm that does not require any global information such as network size or graph diameter. Then, based on PrFTPS, we design a gradient descent algorithm (PrFTPS-GD) to solve the distributed optimization problem. It is proved that under PrFTPS-GD, the privacy of each agent is preserved and a linear convergence rate with respect to the number of optimization iterations is achieved. Finally, numerical simulations are provided to illustrate the effectiveness of the proposed approach.
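For readers unfamiliar with push-sum, the sketch below shows the basic (asymptotic, non-private) push-sum averaging iteration on a small strongly connected digraph with self-loops. It is only the classical building block, not the finite-time or privacy-preserving PrFTPS variant described in the abstract above; the function name, the graph, and the initial values are assumptions made for illustration.

```python
def push_sum_average(values, out_neighbors, iterations=100):
    """Classical push-sum: every node ends up with an estimate of the network average.

    out_neighbors[i] lists the nodes that i sends to (assumed to include i itself).
    """
    n = len(values)
    x = list(values)  # running numerators
    w = [1.0] * n     # running weights
    for _ in range(iterations):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Node i splits its mass equally among its out-neighbors
            # (this makes the weight matrix column-stochastic).
            share = 1.0 / len(out_neighbors[i])
            for j in out_neighbors[i]:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    # The ratio x_i / w_i corrects for the imbalance of the directed graph.
    return [x[i] / w[i] for i in range(n)]


# Hypothetical 3-node digraph (strongly connected, with self-loops).
out_neighbors = {0: [0, 1, 2], 1: [1, 2], 2: [2, 0]}
estimates = push_sum_average([3.0, 6.0, 12.0], out_neighbors)
print(estimates)
```

Note that each node here sends its raw numerator and weight to its neighbors, which is exactly the kind of explicit state exchange the abstract identifies as a privacy risk.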
14.Error Feedback Shines when Features are Rare
Authors:Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko
Abstract: We provide the first proof that gradient descent $\left({\color{green}\sf GD}\right)$ with greedy sparsification $\left({\color{green}\sf TopK}\right)$ and error feedback $\left({\color{green}\sf EF}\right)$ can obtain better communication complexity than vanilla ${\color{green}\sf GD}$ when solving the distributed optimization problem $\min_{x\in \mathbb{R}^d} {f(x)=\frac{1}{n}\sum_{i=1}^n f_i(x)}$, where $n$ = # of clients, $d$ = # of features, and $f_1,\dots,f_n$ are smooth nonconvex functions. Despite intensive research since 2014 when ${\color{green}\sf EF}$ was first proposed by Seide et al., this problem remained open until now. We show that ${\color{green}\sf EF}$ shines in the regime when features are rare, i.e., when each feature is present in the data owned by a small number of clients only. To illustrate our main result, we show that in order to find a random vector $\hat{x}$ such that $\lVert {\nabla f(\hat{x})} \rVert^2 \leq \varepsilon$ in expectation, ${\color{green}\sf GD}$ with the ${\color{green}\sf Top1}$ sparsifier and ${\color{green}\sf EF}$ requires ${\cal O} \left(\left( L+{\color{blue}r} \sqrt{ \frac{{\color{red}c}}{n} \min \left( \frac{{\color{red}c}}{n} \max_i L_i^2, \frac{1}{n}\sum_{i=1}^n L_i^2 \right) }\right) \frac{1}{\varepsilon} \right)$ bits to be communicated by each worker to the server only, where $L$ is the smoothness constant of $f$, $L_i$ is the smoothness constant of $f_i$, ${\color{red}c}$ is the maximal number of clients owning any feature ($1\leq {\color{red}c} \leq n$), and ${\color{blue}r}$ is the maximal number of features owned by any client ($1\leq {\color{blue}r} \leq d$). Clearly, the communication complexity improves as ${\color{red}c}$ decreases (i.e., as features become more rare), and can be much better than the ${\cal O}({\color{blue}r} L \frac{1}{\varepsilon})$ communication complexity of ${\color{green}\sf GD}$ in the same regime.
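The combination of greedy sparsification and error feedback mentioned in the abstract above can be sketched on a toy problem. The code below uses an EF21-style update, which is an assumption here since the abstract does not spell out the exact variant, with quadratic client losses f_i(x) = 0.5 * ||x - b_i||^2. The function names, step size, and data are hypothetical and chosen only so the example runs.

```python
def top_k(v, k):
    """Keep the k entries of v with largest magnitude, zero out the rest."""
    keep = set(sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)[:k])
    return [v[j] if j in keep else 0.0 for j in range(len(v))]


def ef_topk_gd(targets, k=1, lr=0.05, steps=2000):
    """GD with TopK compression and an EF21-style error-feedback update.

    Client i holds f_i(x) = 0.5 * ||x - targets[i]||^2 (a toy assumption),
    so the minimizer of the average loss is the mean of the targets.
    """
    n, d = len(targets), len(targets[0])
    x = [0.0] * d

    def grads(point):
        return [[point[j] - b[j] for j in range(d)] for b in targets]

    # Each client keeps a gradient estimate g_i, initialized by compressing its gradient.
    g = [top_k(gi, k) for gi in grads(x)]
    for _ in range(steps):
        # Server step: move along the average of the clients' current estimates.
        avg = [sum(g[i][j] for i in range(n)) / n for j in range(d)]
        x = [x[j] - lr * avg[j] for j in range(d)]
        new = grads(x)
        for i in range(n):
            # Client step: send only the compressed correction to its estimate.
            delta = top_k([new[i][j] - g[i][j] for j in range(d)], k)
            g[i] = [g[i][j] + delta[j] for j in range(d)]
    return x


# Hypothetical data: most clients "own" a single feature each.
targets = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]
print(ef_topk_gd(targets))
```

With `k=1`, each client transmits a single coordinate per round, yet the iterates still approach the mean of the targets because the untransmitted part of each correction is retained in the client's estimate rather than discarded.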
15.Mathematical Models and Exact Algorithms for the Colored Bin Packing Problem
Authors:Yulle G. F. Borges, Rafael C. S. Schouery, Flávio K. Miyazawa
Abstract: This paper focuses on exact approaches for the Colored Bin Packing Problem (CBPP), a generalization of the classical one-dimensional Bin Packing Problem in which each item has, in addition to its length, a color, and no two items of the same color can appear consecutively in the same bin. To simplify modeling, we present a characterization of any feasible packing of this problem in a way that does not depend on its ordering. Furthermore, we present four exact algorithms for the CBPP. First, we propose a generalization of Valério de Carvalho's arc flow formulation for the CBPP using a graph with multiple layers, each representing a color. Second, we present an improved arc flow formulation that uses a more compact graph and has the same linear relaxation bound as the first formulation. Finally, we design two exponential set-partition models based on reductions to a generalized vehicle routing problem, which are solved by a branch-cut-and-price algorithm through VRPSolver. To compare the proposed algorithms, a varied benchmark set with 574 instances of the CBPP is presented. Results show that the best model, our improved arc flow formulation, was able to solve over 62% of the proposed instances to optimality, the largest of which has 500 items and 37 colors. While being able to solve fewer instances in total, the set-partition models exceeded their arc flow counterparts in instances with a very small number of colors.
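The abstract above does not state its order-independent characterization, so the following sketch relies on an assumption: the standard counting fact that a multiset of items can be sequenced with no two equal colors adjacent exactly when the most frequent color occurs at most once more than all other colors combined. The function name `bin_is_feasible` and the item encoding are hypothetical.

```python
from collections import Counter


def bin_is_feasible(items, capacity):
    """Order-free feasibility check for one bin of a colored bin packing instance.

    items: list of (length, color) pairs assigned to the bin.
    Returns True if the lengths fit and some ordering avoids adjacent equal colors.
    """
    if sum(length for length, _ in items) > capacity:
        return False
    if not items:
        return True
    most_common = max(Counter(color for _, color in items).values())
    # The dominant color needs at least (most_common - 1) separators of other colors.
    return most_common <= len(items) - most_common + 1


print(bin_is_feasible([(3, "red"), (2, "blue"), (3, "red")], capacity=10))
print(bin_is_feasible([(1, "red"), (1, "red")], capacity=10))
```

A check of this form lets a model reason about which items share a bin without also deciding their order inside the bin.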
16.Mean field type control with species dependent dynamics via structured tensor optimization
Authors:Axel Ringh, Isabel Haasler, Yongxin Chen, Johan Karlsson
Abstract: In this work we consider mean field type control problems with multiple species that have different dynamics. We formulate the discretized problem using a new type of entropy-regularized multimarginal optimal transport problem where the cost is a decomposable structured tensor. A novel algorithm for solving such problems is derived, using this structure and leveraging recent results in entropy-regularized optimal transport. The algorithm is then demonstrated on a numerical example of a robot coordination problem for search and rescue, where three different types of robots are used to cover a given area at minimal cost.
17.Inverse optimal control for averaged cost per stage linear quadratic regulators
Authors:Han Zhang, Axel Ringh
Abstract: Inverse Optimal Control (IOC) is a powerful framework for learning a behaviour from observations of experts. The framework aims to identify the underlying cost function with respect to which the observed trajectories (the experts' behaviour) are optimal. In this work, we consider the case of identifying the cost and the feedback law from observed trajectories generated by an "average cost per stage" linear quadratic regulator. We show that identifying the cost is in general an ill-posed problem, and give necessary and sufficient conditions for non-identifiability. Moreover, despite the fact that the problem is in general ill-posed, we construct an estimator for the cost function and show that the control gain corresponding to this estimator is a statistically consistent estimator for the true underlying control gain. In fact, the constructed estimator is based on convex optimization, and hence the proved statistical consistency is also observed in practice. We illustrate the latter by applying the method to a simulation example from rehabilitation robotics.
18.Algorithms for the Bin Packing Problem with Scenarios
Authors:Yulle G. F. Borges, Vinícius L. de Lima, Flávio K. Miyazawa, Lehilton L. C. Pedrosa, Thiago A. de Queiroz, Rafael C. S. Schouery
Abstract: This paper presents theoretical and practical results for the bin packing problem with scenarios, a generalization of the classical bin packing problem which considers the presence of uncertain scenarios, of which only one is realized. For this problem, we propose an absolute approximation algorithm whose ratio is bounded by the square root of the number of scenarios times the approximation ratio for an algorithm for the vector bin packing problem. We also show how an asymptotic polynomial-time approximation scheme is derived when the number of scenarios is constant. As a practical study of the problem, we present a branch-and-price algorithm to solve an exponential model and a variable neighborhood search heuristic. To speed up the convergence of the exact algorithm, we also consider lower bounds based on dual feasible functions. Results of these algorithms show the competence of the branch-and-price in obtaining optimal solutions for about 59% of the instances considered, while the combined heuristic and branch-and-price optimally solved 62% of the instances considered.
19.LQG Risk-Sensitive Mean Field Games with a Major Agent: A Variational Approach
Authors:Hanchao Liu, Dena Firoozi, Michèle Breton
Abstract: Risk sensitivity plays an important role in the study of finance and economics as risk-neutral models cannot capture and justify all economic behaviors observed in reality. Risk-sensitive mean field game theory was developed recently for systems where there exists a large number of indistinguishable, asymptotically negligible and heterogeneous risk-sensitive players, who are coupled via the empirical distribution of state across the population. In this work, we extend the theory of Linear Quadratic Gaussian risk-sensitive mean-field games to the setup where there exists one major agent as well as a large number of minor agents. The major agent has a significant impact on each minor agent and its impact does not collapse with the increase in the number of minor agents. Each agent is subject to linear dynamics with an exponential-of-integral quadratic cost functional. Moreover, all agents interact via the average state of minor agents (so-called empirical mean field) and the major agent's state. We develop a variational analysis approach to derive the best response strategies of agents in the limiting case where the number of agents goes to infinity. We establish that the set of obtained best-response strategies yields a Nash equilibrium in the limiting case and an $\varepsilon$-Nash equilibrium in the finite player case. We conclude the paper with an illustrative example.
1.One-step differentiation of iterative algorithms
Authors:Jérôme Bolte, Edouard Pauwels, Samuel Vaiter
Abstract: In appropriate frameworks, automatic differentiation is transparent to the user at the cost of being a significant computational burden when the number of operations is large. For iterative algorithms, implicit differentiation alleviates this issue but requires custom implementation of Jacobian evaluation. In this paper, we study one-step differentiation, also known as Jacobian-free backpropagation, a method as easy as automatic differentiation and as performant as implicit differentiation for fast algorithms (e.g., superlinear optimization methods). We provide a complete theoretical approximation analysis with specific examples (Newton's method, gradient descent) along with its consequences in bilevel optimization. Several numerical examples illustrate the well-foundedness of the one-step estimator.
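To make the idea of one-step differentiation in the abstract above concrete, here is a minimal sketch on a scalar toy problem: Newton's method for solving x^2 = theta, whose solution sqrt(theta) has derivative 1 / (2 * sqrt(theta)). The iterations are run without tracking derivatives, and only the final step is differentiated with respect to theta while the previous iterate is treated as a constant. The function names are hypothetical and the partial derivative is written by hand rather than obtained from an automatic differentiation library.

```python
def newton_step(x, theta):
    """One Newton step for solving x**2 = theta (theta > 0 assumed)."""
    return 0.5 * (x + theta / x)


def newton_step_dtheta(x, theta):
    """Partial derivative of newton_step with respect to theta, holding x fixed."""
    return 0.5 / x


def sqrt_with_one_step_derivative(theta, iterations=20):
    """Return (approx sqrt(theta), one-step estimate of d sqrt(theta) / d theta)."""
    x_prev = x = max(theta, 1.0)
    for _ in range(iterations):
        # No derivative information is propagated through these iterations.
        x_prev, x = x, newton_step(x, theta)
    # Differentiate only the last step, treating x_prev as a constant.
    return x, newton_step_dtheta(x_prev, theta)


root, slope = sqrt_with_one_step_derivative(4.0)
print(root, slope)
```

For this Newton map the partial derivative with respect to x vanishes at the solution, which is why ignoring the dependence of the previous iterate on theta costs essentially nothing near convergence. For a slowly contracting map such as gradient descent with a small step size, the same estimator would be noticeably biased.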
2.Linear Boundary Port-Hamiltonian Systems with Implicitly Defined Energy
Authors:Bernhard Maschke, Arjan van der Schaft
Abstract: In this paper we extend the previously introduced class of boundary port-Hamiltonian systems to boundary c