Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition
This paper is devoted to solving a convex stochastic optimization problem in
an overparameterization setup for the case where the original gradient
computation is not available, but the objective function value can be computed.
For this class of problems we provide a novel gradient-free algorithm,
constructed by applying a gradient approximation with randomization instead of
a gradient oracle in the biased Accelerated SGD algorithm, which generalizes
the convergence results of the AC-SA algorithm to the case where the gradient
oracle returns a noisy (inexact) objective function value. We also perform a
detailed analysis to find the maximum admissible level of adversarial noise at
which the desired accuracy can still be guaranteed. We verify the theoretical
convergence results on a model example.
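For intuition, here is a minimal sketch of the kind of randomized gradient approximation that can stand in for a gradient oracle in an SGD-type method; the two-point estimator, the smoothing radius tau, and the toy quadratic are illustrative assumptions rather than the paper's exact construction.

    import numpy as np

    def two_point_grad_estimate(f, x, tau=1e-3, rng=np.random.default_rng(0)):
        # Draw a direction e uniformly on the unit sphere.
        e = rng.standard_normal(x.shape)
        e /= np.linalg.norm(e)
        # Two-point randomized gradient approximation:
        #   g = d * (f(x + tau*e) - f(x - tau*e)) / (2*tau) * e
        d = x.size
        return d * (f(x + tau * e) - f(x - tau * e)) / (2.0 * tau) * e

    # Toy usage: plain SGD on a quadratic, with the estimator in place of the gradient.
    f = lambda z: 0.5 * float(z @ z)
    x = np.ones(10)
    for t in range(1000):
        x -= 0.1 / np.sqrt(t + 1) * two_point_grad_estimate(f, x)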
Acceleration Exists! Optimization Problems When Oracle Can Only Compare Objective Function Values
Frequently, the burgeoning field of black-box optimization encounters
challenges due to a limited understanding of the mechanisms of the objective
function. To address such problems, in this work we focus on the deterministic
concept of an Order Oracle, which uses only ordering information between
function values (possibly with some bounded noise), without assuming access to
the values themselves. As theoretical results, we propose a new approach to
creating non-accelerated optimization algorithms (obtained by integrating the
Order Oracle into existing optimization "tools") in non-convex, convex, and
strongly convex settings that are as good as both SOTA coordinate algorithms
with a first-order oracle and SOTA algorithms with an Order Oracle, up to a
logarithmic factor. Moreover, using the proposed approach, we provide the first
accelerated optimization algorithm with the Order Oracle. Using a different
approach, we also establish the asymptotic convergence of the first algorithm
with a stochastic Order Oracle. Finally, we demonstrate the effectiveness of
the proposed algorithms through numerical experiments.
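To make the Order Oracle concept concrete, a minimal sketch follows: the oracle only reports which of two points has the smaller objective value (possibly corrupted by bounded noise), and a one-dimensional golden-section search is driven purely by such comparisons. Both the noise model and the line-search routine are illustrative assumptions, not the algorithms proposed in the paper.

    import numpy as np

    def order_oracle(f, x, y, noise=0.0, rng=np.random.default_rng(1)):
        # Returns -1 if f(x) < f(y) and +1 otherwise; the sign may flip
        # when |f(x) - f(y)| is below the bounded noise level.
        gap = f(x) - f(y) + noise * rng.uniform(-1, 1)
        return int(np.sign(gap))

    def golden_section_by_comparisons(f, a, b, n_iters=50):
        # 1-D minimization on [a, b] that touches f only through comparisons.
        phi = (np.sqrt(5) - 1) / 2
        for _ in range(n_iters):
            x1 = b - phi * (b - a)
            x2 = a + phi * (b - a)
            if order_oracle(f, x1, x2) < 0:
                b = x2
            else:
                a = x1
        return (a + b) / 2

    print(golden_section_by_comparisons(lambda t: (t - 2.0) ** 2, -5.0, 5.0))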
Highly Smoothness Zero-Order Methods for Solving Optimization Problems under PL Condition
In this paper, we study the black-box optimization problem under the
Polyak-Lojasiewicz (PL) condition, assuming that the objective function is not
merely smooth but has higher-order smoothness. By using a "kernel-based"
approximation instead of the exact gradient in the Stochastic Gradient Descent
method, we improve the best known convergence results in the class of
gradient-free algorithms for problems under the PL condition. We generalize our
results to the case where a zero-order oracle returns the function value at a
point with some adversarial noise. We verify our theoretical results on the
example of solving a system of nonlinear equations.
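As a rough illustration of a "kernel-based" zero-order gradient approximation, the sketch below uses a two-point estimate weighted by an odd Legendre-type kernel that exploits higher-order smoothness; the particular kernel (for smoothness order 3) and the step parameters are assumptions, not necessarily the construction used in the paper.

    import numpy as np

    def kernel_grad_estimate(f, x, h=1e-2, rng=np.random.default_rng(2)):
        # Kernel-based two-point estimator:
        #   g = d/(2h) * (f(x + h*r*e) - f(x - h*r*e)) * K(r) * e,
        # with r ~ Uniform[-1, 1], e uniform on the unit sphere, and K an
        # odd Legendre-based kernel (here the standard choice for beta = 3).
        d = x.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)
        r = rng.uniform(-1.0, 1.0)
        K = 15.0 * r / 4.0 * (5.0 - 7.0 * r ** 2)
        return d / (2.0 * h) * (f(x + h * r * e) - f(x - h * r * e)) * K * e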
Upper bounds on maximum admissible noise in zeroth-order optimisation
In this paper, based on the information-theoretic upper bound on noise in
convex Lipschitz-continuous zeroth-order optimisation, we provide corresponding
upper bounds for strongly convex and smooth classes of problems using
non-constructive proofs through optimal reductions. We also show that, based on
a one-dimensional grid-search optimisation algorithm, one can construct an
algorithm for simplex-constrained optimisation with an upper bound on noise
better than that for the ball-constrained case, asymptotically in the dimension.
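For illustration only, a minimal one-dimensional grid search with a zeroth-order oracle corrupted by bounded adversarial noise; the noise model and grid size are assumptions of the sketch, and the achievable accuracy is limited by the noise level delta.

    import numpy as np

    def noisy_grid_search(f, a, b, n_points=100, delta=1e-3,
                          rng=np.random.default_rng(3)):
        # Evaluate a noisy zeroth-order oracle f(t) + xi, |xi| <= delta,
        # on a uniform grid and return the point with the smallest
        # observed value.
        grid = np.linspace(a, b, n_points)
        noisy_values = [f(t) + delta * rng.uniform(-1, 1) for t in grid]
        return grid[int(np.argmin(noisy_values))]

    print(noisy_grid_search(lambda t: abs(t - 0.3), 0.0, 1.0))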
Non-Smooth Setting of Stochastic Decentralized Convex Optimization Problem Over Time-Varying Graphs
Distributed optimization has a rich history and has demonstrated its
effectiveness in many machine learning applications. In this paper we study a
subclass of distributed optimization, namely decentralized optimization in a
non-smooth setting. Decentralized means that agents (machines) working in
parallel on one problem communicate only with neighboring agents (machines),
i.e. there is no central server through which the agents communicate. By the
non-smooth setting we mean that each agent has a convex stochastic non-smooth
function; that is, agents can hold and communicate information only about
values of the objective function, which corresponds to a gradient-free oracle.
In this paper, to minimize the global objective function, which is the sum of
the agents' functions, we create a gradient-free algorithm by applying a
smoothing scheme via randomization. We also verify in experiments the
theoretical convergence results obtained for the proposed gradient-free
algorithm.
Gradient-free algorithm for saddle point problems under overparametrization
This paper focuses on solving a stochastic saddle point problem (SPP) under an
overparameterized regime for the case when gradient computation is impractical.
As an intermediate step, we generalize the Same-sample Stochastic
Extra-gradient algorithm (Gorbunov et al., 2022) to a biased oracle and
establish novel convergence rates. As the main result of the paper, we
introduce an algorithm that uses a gradient approximation instead of a gradient
oracle. We also conduct an analysis to find the maximum admissible level of
adversarial noise and the optimal number of iterations at which our algorithm
can guarantee achieving the desired accuracy.
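For intuition, a sketch of an extra-gradient step for min-max problems in which the operator (grad_x f, -grad_y f) is replaced by a randomized two-point zero-order approximation; the estimator, step size, and toy bilinear game are illustrative assumptions, not the exact algorithm of the paper.

    import numpy as np

    def zo_saddle_operator(f, x, y, tau=1e-3, rng=np.random.default_rng(5)):
        # Randomized zero-order approximation of the monotone operator
        # F(x, y) = (grad_x f(x, y), -grad_y f(x, y)) for min_x max_y f.
        ex = rng.standard_normal(x.shape); ex /= np.linalg.norm(ex)
        ey = rng.standard_normal(y.shape); ey /= np.linalg.norm(ey)
        gx = x.size * (f(x + tau * ex, y) - f(x - tau * ex, y)) / (2 * tau) * ex
        gy = y.size * (f(x, y + tau * ey) - f(x, y - tau * ey)) / (2 * tau) * ey
        return gx, -gy

    def extragradient_step(f, x, y, gamma=0.1):
        # Extra-gradient update with the zero-order operator approximation.
        gx, gy = zo_saddle_operator(f, x, y)
        x_half, y_half = x - gamma * gx, y - gamma * gy   # extrapolation step
        gx, gy = zo_saddle_operator(f, x_half, y_half)
        return x - gamma * gx, y - gamma * gy             # main step

    # Toy bilinear game f(x, y) = x^T y (min over x, max over y).
    f = lambda x, y: float(x @ y)
    x, y = np.ones(3), np.ones(3)
    for _ in range(200):
        x, y = extragradient_step(f, x, y)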
Randomized gradient-free methods in convex optimization
This review presents modern gradient-free methods to solve convex optimization problems. By gradient-free methods, we mean those that use only (noisy) realizations of the objective value. We are motivated by various applications where gradient information is prohibitively expensive or even unavailable. We mainly focus on three criteria: oracle complexity, iteration complexity, and the maximum permissible noise level.
Gradient-Free Federated Learning Methods with ℓ1 and ℓ2-Randomization for Non-Smooth Convex Stochastic Optimization Problems
This paper studies non-smooth problems of convex stochastic optimization.
Using a smoothing technique based on replacing the function value at the
considered point by the function value averaged over a ball (in the ℓ1-norm or
ℓ2-norm) of small radius centered at this point, the original problem is
reduced to a smooth problem (whose gradient Lipschitz constant is inversely
proportional to the radius of the ball). An important property of this
smoothing is the possibility to compute an unbiased estimate of the gradient of
the smoothed function based only on realizations of the original function. The
resulting smooth stochastic optimization problem is proposed to be solved in a
distributed federated learning architecture (the problem is solved in parallel:
nodes make local steps, e.g. stochastic gradient descent, then communicate
all-to-all, and the process is repeated). The goal of this paper is to build,
on top of current advances in gradient-free non-smooth optimization and in the
field of federated learning, gradient-free methods for solving non-smooth
stochastic optimization problems in a federated learning architecture.
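A minimal sketch of the ingredients described above, under illustrative assumptions: an unbiased gradient estimate of the ball-smoothed function built from two realizations of the original non-smooth objective (ℓ2 randomization), plugged into a federated round of local steps followed by all-to-all averaging.

    import numpy as np

    def l2_smoothing_grad(f, x, radius=1e-2, rng=np.random.default_rng(6)):
        # Unbiased estimate of the gradient of the ball-smoothed function
        # f_hat(x) = E f(x + radius*u), u uniform in the unit l2-ball,
        # built from two realizations of the original non-smooth f.
        d = x.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)              # uniform on the unit sphere
        return d / (2 * radius) * (f(x + radius * e) - f(x - radius * e)) * e

    def local_sgd_round(fs, x, gamma=0.05, local_steps=10):
        # One federated round: every node runs several local gradient-free
        # steps on its own objective, then all nodes average (all-to-all).
        local_iterates = []
        for f in fs:
            xi = x.copy()
            for _ in range(local_steps):
                xi -= gamma * l2_smoothing_grad(f, xi)
            local_iterates.append(xi)
        return np.mean(local_iterates, axis=0)

    # Toy non-smooth objectives (l1 distances to different centers).
    fs = [lambda x, c=c: np.linalg.norm(x - c, 1) for c in (-1.0, 0.0, 2.0)]
    x = np.zeros(4)
    for _ in range(100):
        x = local_sgd_round(fs, x)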
