Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition
This paper is devoted to solving a convex stochastic optimization problem in
an overparameterization setup for the case where the original gradient
computation is not available, but the objective function value can be computed.
For this class of problems we provide a novel gradient-free algorithm,
constructed by applying a gradient approximation with randomization instead of
a gradient oracle in the biased Accelerated SGD algorithm, which generalizes
the convergence results of the AC-SA algorithm to the case where the gradient
oracle returns a noisy (inexact) objective function value. We also perform a
detailed analysis to find the maximum admissible level of adversarial noise at
which the desired accuracy can still be guaranteed. We verify the theoretical
convergence results on a model example.
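For intuition, here is a minimal sketch of the kind of randomized gradient approximation that can stand in for a gradient oracle in an SGD-type method; the two-point estimator, the smoothing radius tau, and the toy quadratic are illustrative assumptions rather than the paper's exact construction.

    import numpy as np

    def two_point_grad_estimate(f, x, tau=1e-3, rng=np.random.default_rng(0)):
        # Draw a direction e uniformly on the unit sphere.
        e = rng.standard_normal(x.shape)
        e /= np.linalg.norm(e)
        # Two-point randomized gradient approximation:
        #   g = d * (f(x + tau*e) - f(x - tau*e)) / (2*tau) * e
        d = x.size
        return d * (f(x + tau * e) - f(x - tau * e)) / (2.0 * tau) * e

    # Toy usage: plain SGD on a quadratic, with the estimator in place of the gradient.
    f = lambda z: 0.5 * float(z @ z)
    x = np.ones(10)
    for t in range(1000):
        x -= 0.1 / np.sqrt(t + 1) * two_point_grad_estimate(f, x)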
Acceleration Exists! Optimization Problems When Oracle Can Only Compare Objective Function Values
Frequently, the burgeoning field of black-box optimization encounters
challenges due to a limited understanding of the mechanisms of the objective
function. To address such problems, in this work we focus on the deterministic
concept of an Order Oracle, which uses only ordering information between
function values (possibly with some bounded noise), without assuming access to
the values themselves. As theoretical results, we propose a new approach to
creating non-accelerated optimization algorithms (obtained by integrating the
Order Oracle into existing optimization "tools") in non-convex, convex, and
strongly convex settings that are as good as both SOTA coordinate algorithms
with a first-order oracle and SOTA algorithms with an Order Oracle, up to a
logarithmic factor. Moreover, using the proposed approach, we provide the first
accelerated optimization algorithm with the Order Oracle. Using a different
approach, we also establish the asymptotic convergence of the first algorithm
with a stochastic Order Oracle. Finally, we demonstrate the effectiveness of
the proposed algorithms through numerical experiments.
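To make the Order Oracle concept concrete, a minimal sketch follows: the oracle only reports which of two points has the smaller objective value (possibly corrupted by bounded noise), and a one-dimensional golden-section search is driven purely by such comparisons. Both the noise model and the line-search routine are illustrative assumptions, not the algorithms proposed in the paper.

    import numpy as np

    def order_oracle(f, x, y, noise=0.0, rng=np.random.default_rng(1)):
        # Returns -1 if f(x) < f(y) and +1 otherwise; the sign may flip
        # when |f(x) - f(y)| is below the bounded noise level.
        gap = f(x) - f(y) + noise * rng.uniform(-1, 1)
        return int(np.sign(gap))

    def golden_section_by_comparisons(f, a, b, n_iters=50):
        # 1-D minimization on [a, b] that touches f only through comparisons.
        phi = (np.sqrt(5) - 1) / 2
        for _ in range(n_iters):
            x1 = b - phi * (b - a)
            x2 = a + phi * (b - a)
            if order_oracle(f, x1, x2) < 0:
                b = x2
            else:
                a = x1
        return (a + b) / 2

    print(golden_section_by_comparisons(lambda t: (t - 2.0) ** 2, -5.0, 5.0))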
Highly Smoothness Zero-Order Methods for Solving Optimization Problems under PL Condition
In this paper, we study the black-box optimization problem under the
Polyak-Lojasiewicz (PL) condition, assuming that the objective function is not
merely smooth but has higher-order smoothness. By using a "kernel-based"
approximation instead of the exact gradient in the Stochastic Gradient Descent
method, we improve the best known convergence results in the class of
gradient-free algorithms for problems under the PL condition. We generalize our
results to the case where a zero-order oracle returns the function value at a
point with some adversarial noise. We verify our theoretical results on the
example of solving a system of nonlinear equations.
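As a rough illustration of a "kernel-based" zero-order gradient approximation, the sketch below uses a two-point estimate weighted by an odd Legendre-type kernel that exploits higher-order smoothness; the particular kernel (for smoothness order 3) and the step parameters are assumptions, not necessarily the construction used in the paper.

    import numpy as np

    def kernel_grad_estimate(f, x, h=1e-2, rng=np.random.default_rng(2)):
        # Kernel-based two-point estimator:
        #   g = d/(2h) * (f(x + h*r*e) - f(x - h*r*e)) * K(r) * e,
        # with r ~ Uniform[-1, 1], e uniform on the unit sphere, and K an
        # odd Legendre-based kernel (here the standard choice for beta = 3).
        d = x.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)
        r = rng.uniform(-1.0, 1.0)
        K = 15.0 * r / 4.0 * (5.0 - 7.0 * r ** 2)
        return d / (2.0 * h) * (f(x + h * r * e) - f(x - h * r * e)) * K * e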
Upper bounds on maximum admissible noise in zeroth-order optimisation
In this paper, based on the information-theoretic upper bound on noise in
convex Lipschitz-continuous zeroth-order optimisation, we provide corresponding
upper bounds for strongly convex and smooth classes of problems using
non-constructive proofs through optimal reductions. We also show that, based on
a one-dimensional grid-search optimisation algorithm, one can construct an
algorithm for simplex-constrained optimisation with an upper bound on noise
better than that for the ball-constrained case, asymptotically in the dimension.
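For illustration only, a minimal one-dimensional grid search with a zeroth-order oracle corrupted by bounded adversarial noise; the noise model and grid size are assumptions of the sketch, and the achievable accuracy is limited by the noise level delta.

    import numpy as np

    def noisy_grid_search(f, a, b, n_points=100, delta=1e-3,
                          rng=np.random.default_rng(3)):
        # Evaluate a noisy zeroth-order oracle f(t) + xi, |xi| <= delta,
        # on a uniform grid and return the point with the smallest
        # observed value.
        grid = np.linspace(a, b, n_points)
        noisy_values = [f(t) + delta * rng.uniform(-1, 1) for t in grid]
        return grid[int(np.argmin(noisy_values))]

    print(noisy_grid_search(lambda t: abs(t - 0.3), 0.0, 1.0))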
Non-Smooth Setting of Stochastic Decentralized Convex Optimization Problem Over Time-Varying Graphs
Distributed optimization has a rich history and has demonstrated its
effectiveness in many machine learning applications. In this paper we study a
subclass of distributed optimization, namely decentralized optimization in a
non-smooth setting. Decentralized means that agents (machines) working in
parallel on one problem communicate only with neighboring agents (machines),
i.e. there is no central server through which the agents communicate. By the
non-smooth setting we mean that each agent has a convex stochastic non-smooth
function; that is, agents can hold and communicate information only about
values of the objective function, which corresponds to a gradient-free oracle.
In this paper, to minimize the global objective function, which is the sum of
the agents' functions, we create a gradient-free algorithm by applying a
smoothing scheme via randomization. We also verify in experiments the
theoretical convergence results obtained for the proposed gradient-free
algorithm.
Gradient-free algorithm for saddle point problems under overparametrization
This paper focuses on solving a stochastic saddle point problem (SPP) under an
overparameterized regime for the case when gradient computation is impractical.
As an intermediate step, we generalize the Same-sample Stochastic
Extra-gradient algorithm (Gorbunov et al., 2022) to a biased oracle and
establish novel convergence rates. As the main result of the paper, we
introduce an algorithm that uses a gradient approximation instead of a gradient
oracle. We also conduct an analysis to find the maximum admissible level of
adversarial noise and the optimal number of iterations at which our algorithm
can guarantee achieving the desired accuracy.
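For intuition, a sketch of an extra-gradient step for min-max problems in which the operator (grad_x f, -grad_y f) is replaced by a randomized two-point zero-order approximation; the estimator, step size, and toy bilinear game are illustrative assumptions, not the exact algorithm of the paper.

    import numpy as np

    def zo_saddle_operator(f, x, y, tau=1e-3, rng=np.random.default_rng(5)):
        # Randomized zero-order approximation of the monotone operator
        # F(x, y) = (grad_x f(x, y), -grad_y f(x, y)) for min_x max_y f.
        ex = rng.standard_normal(x.shape); ex /= np.linalg.norm(ex)
        ey = rng.standard_normal(y.shape); ey /= np.linalg.norm(ey)
        gx = x.size * (f(x + tau * ex, y) - f(x - tau * ex, y)) / (2 * tau) * ex
        gy = y.size * (f(x, y + tau * ey) - f(x, y - tau * ey)) / (2 * tau) * ey
        return gx, -gy

    def extragradient_step(f, x, y, gamma=0.1):
        # Extra-gradient update with the zero-order operator approximation.
        gx, gy = zo_saddle_operator(f, x, y)
        x_half, y_half = x - gamma * gx, y - gamma * gy   # extrapolation step
        gx, gy = zo_saddle_operator(f, x_half, y_half)
        return x - gamma * gx, y - gamma * gy             # main step

    # Toy bilinear game f(x, y) = x^T y (min over x, max over y).
    f = lambda x, y: float(x @ y)
    x, y = np.ones(3), np.ones(3)
    for _ in range(200):
        x, y = extragradient_step(f, x, y)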
Randomized gradient-free methods in convex optimization
This review presents modern gradient-free methods to solve convex optimization problems. By gradient-free methods, we mean those that use only (noisy) realizations of the objective value. We are motivated by various applications where gradient information is prohibitively expensive or even unavailable. We mainly focus on three criteria: oracle complexity, iteration complexity, and the maximum permissible noise level.
Gradient-Free Federated Learning Methods with ℓ1 and ℓ2-Randomization for Non-Smooth Convex Stochastic Optimization Problems
This paper studies non-smooth problems of convex stochastic optimization.
Using a smoothing technique based on replacing the function value at the
considered point by the function value averaged over a ball (in the ℓ1-norm or
ℓ2-norm) of small radius centered at this point, the original problem is
reduced to a smooth problem (whose gradient Lipschitz constant is inversely
proportional to the radius of the ball). An important property of this
smoothing is the possibility to compute an unbiased estimate of the gradient of
the smoothed function based only on realizations of the original function. The
resulting smooth stochastic optimization problem is proposed to be solved in a
distributed federated learning architecture (the problem is solved in parallel:
nodes make local steps, e.g. stochastic gradient descent, then communicate
all-to-all, and the process is repeated). The goal of this paper is to build,
on top of current advances in gradient-free non-smooth optimization and in the
field of federated learning, gradient-free methods for solving non-smooth
stochastic optimization problems in a federated learning architecture.
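A minimal sketch of the ingredients described above, under illustrative assumptions: an unbiased gradient estimate of the ball-smoothed function built from two realizations of the original non-smooth objective (ℓ2 randomization), plugged into a federated round of local steps followed by all-to-all averaging.

    import numpy as np

    def l2_smoothing_grad(f, x, radius=1e-2, rng=np.random.default_rng(6)):
        # Unbiased estimate of the gradient of the ball-smoothed function
        # f_hat(x) = E f(x + radius*u), u uniform in the unit l2-ball,
        # built from two realizations of the original non-smooth f.
        d = x.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)              # uniform on the unit sphere
        return d / (2 * radius) * (f(x + radius * e) - f(x - radius * e)) * e

    def local_sgd_round(fs, x, gamma=0.05, local_steps=10):
        # One federated round: every node runs several local gradient-free
        # steps on its own objective, then all nodes average (all-to-all).
        local_iterates = []
        for f in fs:
            xi = x.copy()
            for _ in range(local_steps):
                xi -= gamma * l2_smoothing_grad(f, xi)
            local_iterates.append(xi)
        return np.mean(local_iterates, axis=0)

    # Toy non-smooth objectives (l1 distances to different centers).
    fs = [lambda x, c=c: np.linalg.norm(x - c, 1) for c in (-1.0, 0.0, 2.0)]
    x = np.zeros(4)
    for _ in range(100):
        x = local_sgd_round(fs, x)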
