arXiv Analytics


Search Results: Showing 1-6 of 6

  1. arXiv:2409.06931 (Published 2024-09-11)

    Flexible block-iterative analysis for the Frank-Wolfe algorithm

    Gábor Braun, Sebastian Pokutta, Zev Woodstock

    We prove that the block-coordinate Frank-Wolfe (BCFW) algorithm converges with state-of-the-art rates in both convex and nonconvex settings under a very mild "block-iterative" assumption, newly allowing for (I) progress without activating the most expensive linear minimization oracle(s), LMO(s), at every iteration, (II) parallelized updates that do not require all LMOs, and therefore (III) deterministic parallel update strategies that take into account the numerical cost of the problem's LMOs. Our results apply to short-step BCFW as well as an adaptive method for convex functions. New relationships between updated coordinates and primal progress are proven, and a favorable speedup is demonstrated using FrankWolfe.jl.
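    To make the block-coordinate setup concrete, below is a minimal sketch of a plain cyclic BCFW update in Python: the domain is a product of blocks, each with its own LMO, and only one block is updated per iteration. This illustrates the classical block-coordinate scheme, not the paper's flexible block-iterative strategy, and all names and the step-size rule are assumptions for the example.

    ```python
    import numpy as np

    def bcfw(grad, lmos, x0, steps=200):
        """Cyclic block-coordinate Frank-Wolfe sketch: each iteration
        updates a single block via that block's linear minimization
        oracle. grad(x) returns a list of per-block gradients;
        lmos[i](g) returns a vertex of block i's set minimizing <g, .>."""
        x = [np.array(b, dtype=float) for b in x0]
        for t in range(steps):
            i = t % len(x)             # cyclic schedule; the paper allows
                                       # far more flexible update orders
            g = grad(x)[i]             # gradient of block i only
            v = lmos[i](g)             # block-i LMO call
            gamma = 2.0 / (t + 2.0)    # standard agnostic step size
            x[i] = (1 - gamma) * x[i] + gamma * v
        return x

    # Toy usage: minimize sum_i ||x_i - c_i||^2 over a product of two
    # probability simplices, whose LMO returns a unit basis vector.
    c = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])]
    grad = lambda x: [2 * (x[i] - c[i]) for i in range(len(x))]
    lmo = lambda g: np.eye(len(g))[np.argmin(g)]
    x = bcfw(grad, [lmo, lmo], [np.ones(3) / 3, np.ones(3) / 3])
    ```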

  2. arXiv:2212.02933 (Published 2022-12-06)

    Alternating Linear Minimization: Revisiting von Neumann's alternating projections

    Gábor Braun, Sebastian Pokutta, Robert Weismantel

    In 1933, von Neumann proved the beautiful result that one can approximate a point in the intersection of two convex sets by alternating projections, i.e., by successively projecting onto one set and then the other. This algorithm assumes access to projection operators for both sets. In this note, we consider the much weaker setup where we only have access to linear minimization oracles over the convex sets, and we present an algorithm to find a point in the intersection of the two sets.
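    A hedged illustration of the weaker LMO-only setup: one natural approach is to apply Frank--Wolfe-style steps to $f(x, y) = \|x - y\|^2/2$ over the product of the two sets, calling the two linear minimization oracles in alternation. The sketch below follows that idea; it is not necessarily the paper's exact algorithm, and all names are illustrative.

    ```python
    import numpy as np

    def alternating_linear_minimization(lmo_P, lmo_Q, x0, y0, steps=500):
        """Sketch: Frank-Wolfe-style steps on f(x, y) = ||x - y||^2 / 2
        over P x Q, using only the two linear minimization oracles,
        called in alternation."""
        x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
        for t in range(steps):
            gamma = 2.0 / (t + 2.0)
            v = lmo_P(x - y)               # grad_x f = x - y
            x = (1 - gamma) * x + gamma * v
            w = lmo_Q(y - x)               # grad_y f = y - x
            y = (1 - gamma) * y + gamma * w
        return x, y                        # ||x - y|| shrinks if P and Q meet

    # Toy usage: P = [0,2]^2, Q = unit L2 ball centered at (1.5, 0).
    lmo_box  = lambda g: np.where(g < 0, 2.0, 0.0)   # minimizing vertex of the box
    lmo_ball = lambda g: np.array([1.5, 0.0]) - g / max(np.linalg.norm(g), 1e-12)
    x, y = alternating_linear_minimization(lmo_box, lmo_ball, [0, 0], [1.5, 0])
    ```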

  3. arXiv:2211.14103 (Published 2022-11-25)

    Conditional Gradient Methods

    Gábor Braun, Alejandro Carderera, Cyrille W. Combettes, Hamed Hassani, Amin Karbasi, Aryan Mokhtari, Sebastian Pokutta
    Comments: 238 pages with many figures. The FrankWolfe.jl Julia package (https://github.com/ZIB-IOL/FrankWolfe.jl) provides state-of-the-art implementations of many Frank--Wolfe methods
    Categories: math.OC

    The purpose of this survey is to serve both as a gentle introduction to and a coherent overview of state-of-the-art Frank--Wolfe algorithms, also called conditional gradient algorithms, for function minimization. These algorithms are especially useful in convex optimization when linear optimization is cheaper than projections. The selection of material has been guided by the principle of highlighting crucial ideas as well as presenting new approaches that we believe might become important in the future, with ample citations even of older works imperative to the development of newer methods. Yet our selection is sometimes biased, need not reflect the consensus of the research community, and has certainly missed recent important contributions. After all, the research area of Frank--Wolfe is very active, making it a moving target. We apologize sincerely in advance for any such distortions, and we fully acknowledge: we stand on the shoulders of giants.
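    For orientation, here is a minimal sketch of the classical Frank--Wolfe method the survey is about: each iteration needs one gradient evaluation and one linear minimization over the feasible set, and no projections. The interface is an illustrative assumption, not the survey's notation.

    ```python
    import numpy as np

    def frank_wolfe(grad, lmo, x0, steps=1000, tol=1e-6):
        """Classical Frank-Wolfe / conditional gradient sketch.
        lmo(g) is assumed to return argmin_{v in C} <g, v>."""
        x = np.array(x0, dtype=float)
        for t in range(steps):
            g = grad(x)
            v = lmo(g)                     # cheap linear minimization
            gap = g @ (x - v)              # Frank-Wolfe (dual) gap >= f(x) - f*
            if gap <= tol:
                break
            gamma = 2.0 / (t + 2.0)        # agnostic step-size rule
            x += gamma * (v - x)           # iterate stays a convex combination
        return x

    # Toy usage: minimize ||x - c||^2 over the probability simplex,
    # whose LMO returns a unit basis vector.
    c = np.array([0.2, 0.5, 0.3])
    x = frank_wolfe(lambda x: 2 * (x - c),
                    lambda g: np.eye(len(g))[np.argmin(g)],
                    np.ones(3) / 3)
    ```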

  4. arXiv:2101.02087 (Published 2021-01-06)

    Dual Prices for Frank--Wolfe Algorithms

    Gábor Braun, Sebastian Pokutta

    In this note we observe that for constrained convex minimization problems $\min_{x \in P} f(x)$ over a polytope $P$, the dual prices of the linear program $\min_{z \in P} \nabla f(x) z$, obtained by linearization at an approximately optimal solution $x$, admit an interpretation as rates of change of the optimal value similar to that in linear programming, providing a convex form of sensitivity analysis. This is of particular interest for Frank--Wolfe algorithms (also called conditional gradients), an important class of first-order methods whose basic building block is linear minimization of gradients of $f$ over $P$; most implementations already compute the dual prices as a by-product.
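    A hedged sketch of how these dual prices might be read off in practice: once a Frank--Wolfe run has produced a near-optimal $x$, solve the linearized LP with an off-the-shelf solver and inspect its duals. The problem data below are illustrative; the marginals attribute is exposed by SciPy's HiGHS-backed linprog (SciPy 1.7+), though the exact field layout should be treated as an assumption.

    ```python
    import numpy as np
    from scipy.optimize import linprog

    # Illustrative problem: min ||x - c||^2 over P = {z >= 0, 1^T z <= b}.
    c, b = np.array([0.6, 0.9, 0.3]), 1.0
    grad = lambda x: 2 * (x - c)

    # Stand-in for an approximately optimal Frank-Wolfe iterate.
    x = np.array([0.3, 0.6, 0.1])

    # Linearized LP at x: min <grad f(x), z> over P.
    res = linprog(grad(x), A_ub=np.ones((1, 3)), b_ub=[b],
                  bounds=[(0, None)] * 3, method="highs")

    # Dual price of the constraint 1^T z <= b: estimated rate of change
    # of the convex problem's optimal value per unit change in b.
    dual_price = res.ineqlin.marginals[0]
    print("dual price for 1^T z <= b:", dual_price)
    ```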

  5. arXiv:1805.07311 (Published 2018-05-18)

    Blended Conditional Gradients: the unconditioning of conditional gradients

    Gábor Braun, Sebastian Pokutta, Dan Tu, Stephen Wright

    We present a blended conditional gradient approach for minimizing a smooth convex function over a polytope $P$. It combines the Frank--Wolfe algorithm (also called conditional gradient) with gradient-based steps that differ from away steps and pairwise steps, while still achieving linear convergence for strongly convex functions and good practical performance. Our approach retains all favorable properties of conditional gradient algorithms, most notably avoidance of projections onto $P$ and maintenance of iterates as sparse convex combinations of a limited number of extreme points of $P$. The algorithm decreases measures of optimality (primal and dual gaps) rapidly, both in the number of iterations and in wall-clock time, outperforming even the efficient "lazified" conditional gradient algorithms of [arXiv:1410.8816]; notably, our algorithm is itself lazified. We also present a streamlined version of the algorithm for the case where $P$ is the probability simplex.
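    A heavily simplified sketch of the blending and laziness ideas, under stated assumptions: keep a gap estimate, prefer cheap steps toward the best vertex in the current active set, and call the expensive LMO only when the active set no longer certifies enough progress. Line search, drop steps, and the paper's precise gap bookkeeping are all omitted here.

    ```python
    import numpy as np

    def bcg_sketch(grad, lmo, x0, steps=300):
        """Simplified blended conditional gradients sketch: step toward
        the best active vertex when it certifies enough progress (no
        LMO call); otherwise call the LMO and either take a Frank-Wolfe
        step or halve the gap estimate phi."""
        x = np.array(x0, dtype=float)
        S = [x.copy()]                      # active vertex set
        phi = 1.0                           # optimistic dual-gap estimate
        for t in range(steps):
            g = grad(x)
            gamma = 2.0 / (t + 2.0)
            a = min(S, key=lambda v: g @ v) # cheap scan of the active set
            if g @ (x - a) >= phi:          # local step, no LMO call
                x = x + gamma * (a - x)
                continue
            v = lmo(g)                      # expensive oracle call
            if g @ (x - v) >= phi:          # global Frank-Wolfe step
                S.append(v)
                x = x + gamma * (v - x)
            else:
                phi /= 2                    # tighten the gap estimate
        return x
    ```

    Halving the gap estimate whenever neither step qualifies is what keeps LMO calls rare, mirroring the lazification the abstract mentions.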

  6. arXiv:1407.5144 (Published 2014-07-19, updated 2017-05-02)

    Lower Bounds on the Oracle Complexity of Nonsmooth Convex Optimization via Information Theory

    Gábor Braun, Cristóbal Guzmán, Sebastian Pokutta

    We present an information-theoretic approach to lower bounding the oracle complexity of nonsmooth black-box convex optimization, unifying previous lower-bounding techniques by identifying a combinatorial problem, namely string guessing, as a single source of hardness. As a measure of complexity we use distributional oracle complexity, which subsumes randomized oracle complexity as well as worst-case oracle complexity. We obtain strong lower bounds on distributional oracle complexity for the box $[-1,1]^n$, as well as for the $L^p$-ball for $p \geq 1$ (in both the low-scale and large-scale regimes), matching worst-case upper bounds; hence we close the gap between distributional complexity (and in particular randomized complexity) and worst-case complexity. Furthermore, the bounds remain essentially the same for high-probability and bounded-error oracle complexity, and even for a combination of the two, i.e., bounded-error high-probability oracle complexity. This considerably extends the applicability of known bounds.