Search Results: Showing 1-20 of 87
-
arXiv:2502.05081 (Published 2025-02-07)
Stochastic internal habit formation and optimality
Growth models with internal habit formation have been studied in various settings under the assumption of deterministic dynamics. The purpose of this paper is to explore a stochastic version of the model in Carroll et al. [1997, 2000], one of the most influential on the subject. The goal is twofold: on one hand, to determine how far we can advance in the technical study of the model; on the other, to assess whether at least some of the deterministic outcomes remain valid in the stochastic setting. The resulting optimal control problem proves to be challenging, primarily due to the lack of concavity in the objective function. This feature is present in the model even in the deterministic case (see, e.g., Bambi and Gozzi [2020]). We develop an approach based on Dynamic Programming to establish several useful results, including the regularity of the solution to the corresponding HJB equation and a verification theorem. These results lay the groundwork for studying the model's optimal paths and comparing them with the deterministic case.
-
arXiv:2412.18164 (Published 2024-12-24)
Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence
Comments: 28 pages
Diffusion models have emerged as powerful tools for generative modeling, demonstrating exceptional capability in capturing target data distributions from large datasets. However, fine-tuning these massive models for specific downstream tasks, constraints, and human preferences remains a critical challenge. While recent advances have leveraged reinforcement learning algorithms to tackle this problem, much of the progress has been empirical, with limited theoretical understanding. To bridge this gap, we propose a stochastic control framework for fine-tuning diffusion models. Building on denoising diffusion probabilistic models as the pre-trained reference dynamics, our approach integrates linear dynamics control with Kullback-Leibler regularization. We establish the well-posedness and regularity of the stochastic control problem and develop a policy iteration algorithm (PI-FT) for numerical solution. We show that PI-FT achieves global convergence at a linear rate. Unlike existing work that assumes regularities throughout training, we prove that the control and value sequences generated by the algorithm maintain the regularity. Additionally, we explore extensions of our framework to parametric settings and continuous-time formulations.
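A tabular analogue of the KL-regularized control objective described above is soft policy iteration, where each improvement step tilts the reference policy by the exponentiated state-action value. The sketch below illustrates only that generic mechanism on a made-up finite MDP; it is not the paper's PI-FT algorithm, which operates on continuous denoising dynamics.

```python
import numpy as np

# Tabular KL-regularized control: soft policy iteration against a reference
# policy. All dynamics, rewards, and the temperature tau are hypothetical.
n_s, n_a, gamma, tau = 4, 3, 0.9, 0.5
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))  # P[s, a] = next-state dist
r = rng.random((n_s, n_a))                        # one-step reward
pi_ref = np.full((n_s, n_a), 1.0 / n_a)           # "pre-trained" reference policy

pi = pi_ref.copy()
for _ in range(100):
    # Policy evaluation under the penalized reward r - tau * KL(pi || pi_ref).
    r_pi = (pi * (r - tau * np.log(pi / pi_ref))).sum(axis=1)
    P_pi = np.einsum('sa,sat->st', pi, P)
    V = np.linalg.solve(np.eye(n_s) - gamma * P_pi, r_pi)
    # Soft improvement: the KL-regularized maximizer tilts the reference policy.
    Q = r + gamma * P @ V
    pi = pi_ref * np.exp(Q / tau)
    pi /= pi.sum(axis=1, keepdims=True)
print(np.round(pi, 3))
```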
-
Partially Observed Optimal Stochastic Control: Regularity, Optimality, Approximations, and Learning
In this review/tutorial article, we present recent progress on optimal control of partially observed Markov Decision Processes (POMDPs). We first present regularity and continuity conditions for POMDPs and their belief-MDP reductions, where these constitute weak Feller and Wasserstein regularity and controlled filter stability. These are then utilized to arrive at existence results on optimal policies for both discounted and average cost problems, and regularity of value functions. Then, we study rigorous approximation results involving quantization based finite model approximations as well as finite window approximations under controlled filter stability. Finally, we present several recent reinforcement learning theoretic results which rigorously establish convergence to near optimality under both criteria.
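To make the belief-MDP reduction concrete, here is a minimal numpy sketch of the Bayes-filter update that lifts a finite POMDP to a fully observed MDP on the probability simplex; the transition and observation matrices below are hypothetical.

```python
import numpy as np

def belief_update(b, u, y, T, O):
    """One step of the belief-MDP reduction: the Bayes filter update.

    b : (n,) prior belief over hidden states
    T : (n_actions, n, n) transition kernels, T[u, x, x'] = P(x' | x, u)
    O : (n, m) observation likelihoods, O[x', y] = P(y | x')
    Returns the posterior belief after taking action u and observing y.
    """
    pred = b @ T[u]            # predicted hidden-state distribution
    post = pred * O[:, y]      # reweight by the observation likelihood
    return post / post.sum()   # normalize (assumes the observation has P(y) > 0)

# Toy example: 2 hidden states, 2 actions, 2 observations (all hypothetical).
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[0.8, 0.2], [0.3, 0.7]])
b = np.array([0.5, 0.5])
print(belief_update(b, u=0, y=1, T=T, O=O))
```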
-
arXiv:2411.19826 (Published 2024-11-29)
Optimality of Gerver's Sofa
Comments: 119 pages, 21 figuresWe resolve the moving sofa problem by showing that Gerver's construction with 18 curve sections attains the maximum area $2.2195\cdots$.
-
arXiv:2410.02895 (Published 2024-10-03)
Approximation Schemes for POMDPs with Continuous Spaces and Their Near Optimality
We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to studying POMDPs, requires rigorous approximation methods for practical applications, since the state space is lifted to the space of probability measures. Generalizing recent work, in this paper we present rigorous approximation methods via discretizing the observation space and constructing a fully observed finite MDP model using a finite-length history of the discrete observations and control actions. We show that the resulting policy is near-optimal under some regularity assumptions on the channel, and under certain controlled filter stability requirements for the hidden state process. Furthermore, by quantizing the measurements, we are able to utilize refined filter stability conditions. We also provide a Q-learning algorithm that uses a finite memory of discretized information variables, and prove its convergence to the optimality equation of the finite fully observed MDP constructed using the approximation method.
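A minimal sketch of the finite-memory construction described above, under made-up dynamics: the last N quantized observations and actions serve as the state of a finite MDP, and standard tabular Q-learning runs on it. The chain, quantizer, noise level, and reward are illustrative only, not from the paper.

```python
import numpy as np
from collections import defaultdict, deque

rng = np.random.default_rng(0)
# Hypothetical hidden chain: 2 states, 2 actions, noisy 1-bit observations.
P = np.array([[[0.9, 0.1], [0.3, 0.7]],   # transitions under action 0
              [[0.5, 0.5], [0.2, 0.8]]])  # transitions under action 1
quantize = lambda y: int(y > 0.5)         # observation quantizer

gamma, alpha, eps, N = 0.95, 0.1, 0.1, 3  # N = finite memory window length
Q = defaultdict(float)                    # Q-table over (window, action)
mem = deque([(0, 0)] * N, maxlen=N)       # last N (quantized obs, action) pairs
x = 0
for _ in range(50_000):
    s = tuple(mem)                        # finite information state
    u = int(rng.integers(2)) if rng.random() < eps else \
        max(range(2), key=lambda a: Q[s, a])
    x = rng.choice(2, p=P[u, x])          # hidden state evolves
    y = quantize(x + 0.3 * rng.standard_normal())  # noisy, quantized observation
    r = float(x == 1)                     # reward depends on the hidden state
    mem.append((y, u))
    s2 = tuple(mem)
    target = r + gamma * max(Q[s2, a] for a in range(2))
    Q[s, u] += alpha * (target - Q[s, u])
```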
-
arXiv:2409.13849 (Published 2024-09-20)
Optimality of a barrier strategy in a spectrally negative Lévy model with a level-dependent intensity of bankruptcy
We consider de Finetti's stochastic control problem for a spectrally negative L\'evy process in an Omega model. In such a model, the (controlled) process is allowed to spend time under the critical level but is then subject to a level-dependent intensity of bankruptcy. First, before considering the control problem, we derive some analytical properties of the corresponding Omega scale functions. Second, we prove that there exists a barrier strategy that is optimal for this control problem under a mild assumption on the L\'evy measure. Finally, we analyse numerically the impact of the bankruptcy rate function on the optimal strategy.
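As an illustration of the objects involved (not the paper's scale-function machinery), the following Monte Carlo sketch values a barrier strategy in an Omega model where the surplus is a Brownian motion with drift minus compound Poisson exponential jumps; all parameter values and the bankruptcy rate function are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def barrier_value(b, x0=2.0, mu=1.0, sigma=0.5, lam=1.0, jump_mean=0.5, q=0.05,
                  bankruptcy_rate=lambda x: 3.0 if x < 0 else 0.0,
                  dt=1e-2, T=50.0, n_paths=500):
    """Monte Carlo value of a barrier strategy at level b in an Omega model.

    Surplus = Brownian motion with drift minus compound Poisson (exponential)
    jumps; below 0 the firm goes bankrupt at rate bankruptcy_rate(x).
    Illustrative parameter values, not taken from the paper."""
    total = 0.0
    for _ in range(n_paths):
        x, t, disc_div = x0, 0.0, 0.0
        while t < T:
            if x < 0 and rng.random() < bankruptcy_rate(x) * dt:
                break                              # level-dependent bankruptcy
            x += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
            if rng.random() < lam * dt:
                x -= rng.exponential(jump_mean)    # spectrally negative jump
            if x > b:                              # barrier: pay out the excess
                disc_div += np.exp(-q * t) * (x - b)
                x = b
            t += dt
        total += disc_div
    return total / n_paths

for b in (0.5, 1.5, 3.0):
    print(f"barrier b = {b}: value ~ {barrier_value(b):.3f}")
```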
-
arXiv:2409.09890 (Published 2024-09-15)
Optimality of Motion Camouflage Under Escape Uncertainty
Categories: math.OC
Motion camouflage can be a useful tactic for a pursuer attempting to conceal their true trajectory from their target. Many previous studies determine optimal trajectories subject to motion camouflage constraints, but these analyses do not address when the tactic is optimal to use, nor do they account for the pursuer's inability to predict if and when the target will try to escape. We present an optimal control framework to determine when the pursuer should use motion camouflage amidst uncertainty in the target's escape attempt time. Focusing on the illustrative problem of a male hover fly pursuing a female hover fly for mating, we model the female fly's escape response as the result of a non-homogeneous Poisson point process with a biologically informed rate function, and we obtain and numerically solve two Hamilton-Jacobi-Bellman (HJB) PDEs which encode the pursuer's optimal trajectories. Our numerical experiments and statistics illustrate when it is optimal to use motion camouflage pursuit tactics under varying degrees of the target's visual acuity and tolerance to the pursuer's presence.
-
arXiv:2409.09745 (Published 2024-09-15)
The Optimality of (Accelerated) SGD for High-Dimensional Quadratic Optimization
Comments: 46 pages
Stochastic gradient descent (SGD) is a widely used algorithm in machine learning, particularly for neural network training. Recent studies on SGD for canonical quadratic optimization or linear regression show that it attains good generalization under suitable high-dimensional settings. However, a fundamental question -- for what kinds of high-dimensional learning problems SGD and its accelerated variants can achieve optimality -- has yet to be well studied. This paper investigates SGD with two essential components in practice: an exponentially decaying step size schedule and momentum. We establish a convergence upper bound for momentum-accelerated SGD (ASGD) and propose concrete classes of learning problems under which SGD or ASGD achieves min-max optimal convergence rates. The characterization of the target function is based on standard power-law decays in (functional) linear regression. Our results unveil new insights for understanding the learning bias of SGD: (i) SGD is efficient in learning ``dense'' features where the corresponding weights are subject to an infinity-norm constraint; (ii) SGD is efficient for easy problems without suffering from the saturation effect; (iii) momentum can accelerate the convergence rate by an order when the learning problem is relatively hard. To our knowledge, this is the first work to clearly identify the optimal boundary between SGD and ASGD for the problem under mild settings.
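A minimal sketch of the algorithm studied, run on a synthetic linear regression problem whose covariance spectrum follows the kind of power-law decay mentioned in the abstract; all hyperparameter values are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 2000
eigs = np.arange(1, d + 1, dtype=float) ** -2.0   # power-law covariance spectrum
X = rng.standard_normal((n, d)) * np.sqrt(eigs)
w_star = rng.standard_normal(d) / np.arange(1, d + 1)
y = X @ w_star + 0.1 * rng.standard_normal(n)

def asgd(beta=0.9, eta0=0.05, decay=0.999, epochs=5):
    """Momentum SGD with an exponentially decaying step size (illustrative values)."""
    w, v, eta = np.zeros(d), np.zeros(d), eta0
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = (X[i] @ w - y[i]) * X[i]   # single-sample gradient
            v = beta * v + g               # heavy-ball momentum
            w -= eta * v
            eta *= decay                   # exponential step-size schedule
    return w

w = asgd()
print("training MSE:", np.mean((X @ w - y) ** 2))
```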
-
arXiv:2407.08425 (Published 2024-07-11)
Optimality of vaccination for an SIR epidemic with an ICU constraint
Categories: math.OC
This paper studies an optimal control problem for a class of SIR epidemic models, in scenarios in which the infected population is constrained to remain below a critical threshold imposed by the ICU (intensive care unit) capacity. The vaccination effort possibly imposed by the health-care decision makers is classically modeled by a control input affecting the epidemic dynamics. After a preliminary viability analysis, the existence of optimal controls is established, and their structure is characterized by using a state-constrained version of Pontryagin's theorem. The resulting optimal controls necessarily have a bang-bang regime with at most one switch. More precisely, the optimal strategies impose the maximum-allowed vaccination effort in an initial period of time, which can cease only once the ICU constraint can be satisfied without further vaccination. The switching times are characterized in order to identify conditions under which vaccination should be implemented or halted. The uniqueness of the optimal control is also discussed. Numerical examples illustrate our theoretical results and the corresponding optimal strategies. The analysis is eventually extended to the infinite horizon by $\Gamma$-convergence arguments.
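The bang-bang structure is easy to probe numerically: simulate the controlled SIR dynamics with full vaccination effort up to a switching time and none afterwards, and check the infected peak against the ICU cap. All parameter values below are illustrative, not from the paper.

```python
import numpy as np

def peak_infected(switch_time, beta=0.3, gamma=0.1, u_max=0.01,
                  s0=0.99, i0=0.01, T=400.0, dt=0.1):
    """SIR with bang-bang vaccination: full effort before switch_time, none after.
    Illustrative parameters; forward-Euler integration."""
    s, i, peak = s0, i0, i0
    for k in range(int(T / dt)):
        u = u_max if k * dt < switch_time else 0.0   # one switch, as in the paper
        ds = -beta * s * i - u * s                   # vaccination removes susceptibles
        di = beta * s * i - gamma * i
        s, i = s + dt * ds, i + dt * di
        peak = max(peak, i)
    return peak

icu_cap = 0.05
for tau in (0.0, 50.0, 150.0):
    p = peak_infected(tau)
    status = "meets" if p <= icu_cap else "violates"
    print(f"switch at t={tau:5.1f}: infected peak {p:.3f} ({status} ICU cap {icu_cap})")
```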
-
arXiv:2405.16223 (Published 2024-05-25)
Near Optimality of Lipschitz and Smooth Policies in Controlled Diffusions
Comments: 10 pages. arXiv admin note: substantial text overlap with arXiv:2209.14982
Categories: math.OC
For optimal control of diffusions under several criteria, due to computational or analytical reasons, many studies have a priori assumed control policies to be Lipschitz or smooth, often with no rigorous analysis of whether this restriction entails a loss. While optimality of Markov/stationary Markov policies for expected finite horizon/infinite horizon (discounted/ergodic) cost and cost-up-to-exit-time optimal control problems can be established under certain technical conditions, an optimal solution is typically only measurable in the state (and time, if the horizon is finite) with no a priori additional structural properties. In this paper, building on our recent work [S. Pradhan and S. Y\"uksel, Continuity of cost in Borkar control topology and implications on discrete space and time approximations for controlled diffusions under several criteria, Electronic Journal of Probability (2024)] establishing the regularity of the optimal cost on the space of control policies under the Borkar control topology for a general class of diffusions, we establish near optimality of smooth/Lipschitz continuous policies for optimal control under expected finite horizon, infinite horizon discounted/average, and up-to-exit-time cost criteria. Under mild assumptions, we first show that smooth/Lipschitz continuous policies are dense in the space of Markov/stationary Markov policies under the Borkar topology. Then, utilizing the continuity of optimal costs as a function of policies on the space of Markov/stationary policies under the Borkar topology, we establish that optimal policies can be approximated by smooth/Lipschitz continuous policies with arbitrary precision. While our results are extensions of our recent work, the practical significance of an explicit statement and accessible presentation dedicated to Lipschitz and smooth policies, given their prominence in the literature, motivates the current paper.
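The density claim can be illustrated, in a much weaker pointwise sense than the Borkar-topology argument used in the paper, by mollifying a discontinuous policy: smooth approximants exist at every scale, but their Lipschitz constants blow up as the approximation tightens.

```python
import numpy as np

def mollify(policy, eps, grid):
    """Gaussian smoothing of a policy evaluated on a grid (illustrative only)."""
    K = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / eps) ** 2)
    K /= K.sum(axis=1, keepdims=True)      # row-normalized smoothing kernel
    return K @ policy(grid)

grid = np.linspace(-2.0, 2.0, 401)
bang_bang = lambda x: np.sign(x)           # a merely measurable (discontinuous) policy
for eps in (0.5, 0.1, 0.02):
    smooth = mollify(bang_bang, eps, grid)
    lip = np.max(np.abs(np.gradient(smooth, grid)))
    print(f"eps={eps}: sup |d/dx| ~ {lip:.1f}")  # Lipschitz constant grows as eps -> 0
```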
-
arXiv:2404.18794 (Published 2024-04-29)
Optimality and uniqueness of the D_4 root system
We prove that the $D_4$ root system (the set of vertices of the regular $24$-cell) is the unique optimal kissing configuration in $\mathbb R^4$, and is an optimal spherical code. For this, we use semidefinite programming to compute an exact optimal solution to the second level of the Lasserre hierarchy. We also improve the upper bound for the kissing number problem in $\mathbb R^6$ to $77$.
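The certified configuration itself is easy to verify numerically (the optimality proof, via an exact level-2 Lasserre SDP, is of course far beyond this check): the 24 normalized $D_4$ roots are unit vectors whose pairwise inner products are at most $1/2$, i.e., pairwise angles of at least 60 degrees, which is exactly the kissing condition.

```python
import itertools
import numpy as np

# The 24 roots of D4: all vectors with entries (+-1, +-1, 0, 0) in some order,
# normalized to the unit sphere in R^4.
roots = []
for i, j in itertools.combinations(range(4), 2):
    for si, sj in itertools.product((1, -1), repeat=2):
        v = np.zeros(4)
        v[i], v[j] = si, sj
        roots.append(v / np.sqrt(2))
roots = np.array(roots)
print(len(roots))  # 24 unit vectors (the vertices of the regular 24-cell)

# Kissing condition: distinct points at angle >= 60 degrees, i.e. inner products <= 1/2.
G = roots @ roots.T
off_diag = G[~np.eye(len(roots), dtype=bool)]
print(off_diag.max() <= 0.5 + 1e-12)  # True: a valid kissing configuration
```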
-
arXiv:2403.15159 (Published 2024-03-22)
Near-optimal performance of stochastic economic MPC
Categories: math.OC
This paper presents the first results on near optimality in expectation of the closed-loop solutions for stochastic economic MPC. The approach relies on a recently developed turnpike property for stochastic optimal control problems at an optimal stationary process, combined with techniques for analyzing time-varying economic MPC schemes. We obtain near optimality in finite time as well as overtaking and average near optimality on infinite time horizons.
-
arXiv:2401.00542 (Published 2023-12-31)
A relaxation viewpoint to Unbalanced Optimal Transport: duality, optimality and Monge formulation
Comments: 57 pages
We present a general convex relaxation approach to study a wide class of Unbalanced Optimal Transport problems for finite non-negative measures with possibly different masses. These are obtained as the lower semicontinuous and convex envelope of a cost for non-negative Dirac masses. New general primal-dual formulations, optimality conditions, and metric-topological properties are carefully studied and discussed.
-
arXiv:2310.18164 (Published 2023-10-27)
Optimality of a refraction strategy in the optimal dividends problem with absolutely continuous controls subject to Parisian ruin
We consider de Finetti's optimal dividends problem with absolutely continuous strategies in a spectrally negative L\'evy model with Parisian ruin as the termination time. The problem considered is essentially a generalization of both the control problems considered by Kyprianou, Loeffen & P\'erez (2012) and by Renaud (2019). Using the language of scale functions for Parisian fluctuation theory, and under the assumption that the density of the L\'evy measure is completely monotone, we prove that a refraction dividend strategy is optimal and we characterize the optimal threshold. In particular, we study the effect of the rate of Parisian implementation delays on this optimal threshold.
-
arXiv:2308.08183 (Published 2023-08-16)
Refraction strategies in stochastic control: optimality for a general Lévy process model
Comments: 24 pages
We revisit an absolutely continuous version of the stochastic control problem driven by a L\'evy process. A strategy must be absolutely continuous with respect to the Lebesgue measure and the running cost function is assumed to be convex. We show the optimality of a refraction strategy, which adjusts the drift of the state process at a constant rate whenever it surpasses a certain threshold. The optimality holds for a general L\'evy process, generalizing the spectrally negative case presented in Hern\'andez-Hern\'andez et al. (2016).
-
arXiv:2308.03297 (Published 2023-08-07)
Approximate Constrained Discounted Dynamic Programming with Uniform Feasibility and Optimality
Categories: math.OC
We consider a dynamic programming (DP) approach to approximately solving an infinite-horizon constrained Markov decision process (CMDP) problem with a fixed initial state, for the expected total discounted-reward criterion with a uniform-feasibility constraint on the expected total discounted cost, over a deterministic, history-independent, and stationary policy set. We derive a DP-equation that holds recursively for a CMDP problem and its sub-CMDP problems, where each problem, induced from the parameters of the original CMDP problem, admits a uniformly optimal feasible policy in the policy set associated with the inputs to the problem. A policy constructed from the DP-equation is shown to achieve the optimal values, defined for the CMDP problem the policy solves, at all states. Based on this result, we discuss off-line and on-line computational algorithms, motivated by policy iteration for MDPs, whose output sequences exhibit local convergence for the original CMDP problem.
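For contrast with the uniform-feasibility DP-equation developed in the paper, the textbook workaround for a single-constraint CMDP is Lagrangian scalarization with a bisected multiplier, sketched below on a made-up two-state problem; this heuristic searches deterministic policies from one fixed initial state and does not provide the uniform feasibility across states that the paper's construction targets.

```python
import numpy as np

# Tiny CMDP with hypothetical data: 2 states, 2 actions; action 1 is costly.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.9, 0.1], [0.4, 0.6]]])  # P[a, s, s']
R = np.array([[1.0, 2.0], [0.5, 1.5]])    # reward R[s, a]
C = np.array([[0.0, 1.0], [0.0, 1.0]])    # cost   C[s, a]
gamma, budget = 0.9, 3.0                  # discounted-cost budget from state 0

def scalarized(lam, iters=500):
    """Value iteration on R - lam * C; returns the greedy policy and its
    expected total discounted cost from the initial state 0."""
    V = np.zeros(2)
    for _ in range(iters):
        Q = R - lam * C + gamma * (P @ V).T   # Q[s, a]
        V = Q.max(axis=1)
    pi = Q.argmax(axis=1)
    P_pi = np.array([P[pi[s], s] for s in range(2)])
    c_pi = C[np.arange(2), pi]
    return pi, np.linalg.solve(np.eye(2) - gamma * P_pi, c_pi)[0]

lo, hi = 0.0, 10.0                        # bisect the multiplier to meet the budget
for _ in range(40):
    lam = 0.5 * (lo + hi)
    pi, jc = scalarized(lam)
    lo, hi = (lo, lam) if jc <= budget else (lam, hi)
print("greedy policy:", pi, "discounted cost from s0: %.3f" % jc)
```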
-
arXiv:2307.14741 (Published 2023-07-27)
Optimality of Split Covariance Intersection Fusion
Categories: math.OC
Linear fusion is a cornerstone of estimation theory. Optimal linear fusion was derived by Bar-Shalom and Campo in the 1980s. It requires knowledge of the cross-covariances between the errors of the estimators. In distributed or cooperative systems, these cross-covariances are difficult to compute. To avoid underestimating the errors when these cross-covariances are unknown, conservative fusions must be performed. A conservative fusion provides a fused estimator with a covariance bound which is guaranteed to be larger than the true (but not computable) covariance of the error. Previous research by Reinhardt et al. proved that, if no additional assumption is made about the errors of the estimators, the minimal bound for fusing two estimators is given by a fusion called Covariance Intersection (CI). In practice, the errors of the estimators often have an uncorrelated component, because the dynamics or measurement noise is assumed to be independent. In this context, CI is no longer the optimal method, and an adaptation called Split Covariance Intersection (SCI) has been designed to take advantage of these uncorrelated components. The contribution of this paper is to prove that SCI is the optimal fusion rule for two estimators under the assumption that they have an uncorrelated component. It is proved that SCI provides the optimal covariance bound with respect to any increasing cost function. To prove the result, a minimal volume that should contain all conservative bounds is derived, and the SCI bounds are proved to be the only bounds that tightly circumscribe this minimal volume.
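A small numpy/scipy sketch of the SCI fusion rule whose optimality the paper establishes, as it is usually stated in the literature: each covariance is split into a correlated part, inflated by the CI weight, and an independent part, and the weight is chosen to minimize the trace of the fused bound. The example covariances are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def sci_fuse(x1, P1d, P1i, x2, P2d, P2i):
    """Split Covariance Intersection with a trace-minimizing weight.

    P*d: bound on the correlated error component; P*i: independent component.
    """
    def fused(w):
        S1 = np.linalg.inv(P1d / w + P1i)        # inflate correlated part by 1/w
        S2 = np.linalg.inv(P2d / (1 - w) + P2i)  # ... and by 1/(1-w)
        P = np.linalg.inv(S1 + S2)               # information-form fusion
        x = P @ (S1 @ x1 + S2 @ x2)
        return x, P

    res = minimize_scalar(lambda w: np.trace(fused(w)[1]),
                          bounds=(1e-6, 1 - 1e-6), method='bounded')
    return fused(res.x)

# Toy 2-D example with hypothetical covariance splits.
x1, x2 = np.array([1.0, 0.0]), np.array([0.8, 0.3])
P1d, P1i = np.diag([0.5, 0.5]), np.diag([0.2, 0.1])
P2d, P2i = np.diag([0.3, 0.8]), np.diag([0.1, 0.3])
x, P = sci_fuse(x1, P1d, P1i, x2, P2d, P2i)
print(x, np.trace(P))
```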
-
arXiv:2306.15523 (Published 2023-06-27)
Towards the optimality of the ball for the Rayleigh Conjecture concerning the clamped plate
Comments: 28 pages, 3 figures
In 1995, Nadirashvili and subsequently Ashbaugh and Benguria proved the Rayleigh Conjecture concerning the first eigenvalue of the bilaplacian with clamped boundary conditions in dimensions $2$ and $3$. Since then, the conjecture has remained open in dimension $d>3$. In this document, we contribute to answering the conjecture under a particular assumption regarding the critical values of the optimal eigenfunction. More precisely, we prove that if the optimal eigenfunction has no critical value except its minimum and maximum, then the conjecture holds. This is performed thanks to an improvement of Talenti's comparison principle, made possible by a fine study of the geometry of the eigenfunction's nodal domains.
-
arXiv:2305.03608 (Published 2023-05-05)
On the Optimality, Stability, and Feasibility of Control Barrier Functions: An Adaptive Learning-Based Approach
Safety has been a critical issue for the deployment of learning-based approaches in real-world applications. To address this issue, the control barrier function (CBF) and its variants have attracted extensive attention for safety-critical control. However, due to the myopic one-step nature of CBFs and the lack of principled methods to design the class-$\mathcal{K}$ functions, there are still fundamental limitations of current CBFs: optimality, stability, and feasibility. In this paper, we propose a novel and unified approach to address these limitations with the Adaptive Multi-step Control Barrier Function (AM-CBF), where we parameterize the class-$\mathcal{K}$ function by a neural network and train it together with the reinforcement learning policy. Moreover, to mitigate the myopic nature, we propose a novel \textit{multi-step training and single-step execution} paradigm to make the CBF farsighted while the execution still solves only a single-step convex quadratic program. Our method is evaluated on first- and second-order systems in various scenarios, where our approach outperforms the conventional CBF both qualitatively and quantitatively.
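For reference, the conventional single-step CBF safety filter that the paper builds on, with a hand-tuned linear class-$\mathcal{K}$ function standing in for the learned one, reduces for a single affine constraint to a closed-form projection. This sketch is the baseline, not the AM-CBF method, and the example system is hypothetical.

```python
import numpy as np

def cbf_safety_filter(x, u_des, f, g, h, grad_h, alpha):
    """One-step CBF-QP safety filter for control-affine dynamics xdot = f + g u:
    min ||u - u_des||^2  s.t.  Lf h + Lg h . u + alpha(h(x)) >= 0.
    With one affine constraint the QP is a closed-form half-space projection."""
    Lf = grad_h(x) @ f(x)
    Lg = grad_h(x) @ g(x)                     # Lie derivative row vector
    slack = Lf + Lg @ u_des + alpha(h(x))
    if slack >= 0:                            # desired input is already safe
        return u_des
    return u_des - slack * Lg / (Lg @ Lg)     # project onto the safe half-space

# Toy example: 2-D single integrator kept outside the unit disk at the origin.
f = lambda x: np.zeros(2)
g = lambda x: np.eye(2)
h = lambda x: x @ x - 1.0                     # safe set is {h >= 0}
grad_h = lambda x: 2 * x
alpha = lambda s: 5.0 * s                     # hand-tuned linear class-K function
                                              # (the paper learns alpha with a NN)
x = np.array([1.2, 0.0])
u_des = np.array([-1.0, 0.0])                 # pushes toward the obstacle
print(cbf_safety_filter(x, u_des, f, g, h, grad_h, alpha))
```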
-
Distributionally Robust Principal-Agent Problems and Optimality of Contracts
Subjects: 90Bxx
We propose a distributionally robust principal-agent formulation, which generalizes some common variants of worst-case and Bayesian principal-agent problems. We construct a theoretical framework to certify whether any surjective contract family is optimal, and bound its sub-optimality. We then apply the framework to study the optimality of affine contracts. We show with geometric intuition that these simple contract families are optimal when the surplus function is convex and there exists a technology type that is simultaneously least productive and least efficient. We also provide succinct expressions to quantify the optimality gap for any surplus function, based on its concave biconjugate. This new framework complements the current literature in two ways: the invention of a new toolset, and an understanding of affine contracts' performance in a larger landscape. Our results also shed light on the technical roots of this question: why are there more positive results in the recent literature showing simple contracts' optimality in robust settings rather than in stochastic settings? This phenomenon is related to two technical facts: the sum of quasi-concave functions is not quasi-concave, and the maximization and expectation operators do not commute.