
    Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition

    This paper is devoted to solving a convex stochastic optimization problem in an overparameterization setup for the case where the original gradient computation is not available, but the objective function value can be computed. For this class of problems we provide a novel gradient-free algorithm, constructed by applying a gradient approximation with $l_2$ randomization in place of the gradient oracle in the biased Accelerated SGD algorithm, which generalizes the convergence results of the AC-SA algorithm to the case where the gradient oracle returns a noisy (inexact) objective function value. We also perform a detailed analysis to find the maximum admissible level of adversarial noise at which the desired accuracy can still be guaranteed. We verify the theoretical convergence results on a model example.
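
    To make the building block concrete, here is a minimal sketch of a two-point gradient approximation with $l_2$ randomization, plugged into a plain Nesterov-momentum loop rather than the paper's biased AC-SA scheme; the function names, step sizes, and momentum rule are illustrative assumptions, not the authors' method.

    import numpy as np

    def l2_grad_estimate(f, x, h, rng):
        # Two-point gradient approximation with l2 randomization:
        # e is uniform on the unit sphere and
        #   g = d/(2h) * (f(x + h*e) - f(x - h*e)) * e
        # estimates the gradient of a smoothed surrogate of f.
        d = x.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)
        return d / (2 * h) * (f(x + h * e) - f(x - h * e)) * e

    def accelerated_zo_sgd(f, x0, steps=1000, lr=0.1, momentum=0.9, h=1e-4, seed=0):
        # Generic momentum SGD driven by the zero-order estimate
        # (a stand-in for the accelerated scheme, not its analysis).
        rng = np.random.default_rng(seed)
        x = x0.astype(float).copy()
        v = np.zeros_like(x)
        for _ in range(steps):
            g = l2_grad_estimate(f, x + momentum * v, h, rng)
            v = momentum * v - lr * g
            x = x + v
        return x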

    Acceleration Exists! Optimization Problems When Oracle Can Only Compare Objective Function Values

    The burgeoning field of black-box optimization frequently encounters challenges stemming from a limited understanding of the mechanisms of the objective function. To address such problems, in this work we focus on the deterministic concept of an Order Oracle, which only provides order access between function values (possibly with some bounded noise), without assuming access to the values themselves. As theoretical results, we propose a new approach to creating non-accelerated optimization algorithms (obtained by integrating the Order Oracle into existing optimization "tools") in non-convex, convex, and strongly convex settings that match both SOTA coordinate algorithms with a first-order oracle and SOTA algorithms with an Order Oracle up to a logarithmic factor. Moreover, using the proposed approach, we provide the first accelerated optimization algorithm using the Order Oracle. Using a different approach, we also establish the asymptotic convergence of the first algorithm with a stochastic Order Oracle concept. Finally, we demonstrate the effectiveness of the proposed algorithms through numerical experiments.
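
    As a rough illustration of what an Order Oracle provides (and what it withholds), here is a hedged sketch: coordinate descent whose one-dimensional steps are ternary searches driven purely by comparisons of function values. This is a generic comparison-based method, not the paper's accelerated algorithm; all names and constants are assumptions.

    import numpy as np

    def order_oracle(f):
        # Comparator revealing only which point has the smaller
        # objective value, never the values themselves.
        return lambda x, y: f(x) <= f(y)

    def comparison_coordinate_descent(cmp, x0, radius=1.0, sweeps=50, tol=1e-6):
        # Each 1-D step is a ternary search over [x_i - radius, x_i + radius]
        # that queries the oracle only through comparisons.
        x = x0.astype(float).copy()
        for _ in range(sweeps):
            for i in range(x.size):
                lo, hi = x[i] - radius, x[i] + radius
                while hi - lo > tol:
                    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
                    a, b = x.copy(), x.copy()
                    a[i], b[i] = m1, m2
                    if cmp(a, b):   # f(a) <= f(b): minimum lies left of m2
                        hi = m2
                    else:
                        lo = m1
                x[i] = (lo + hi) / 2
        return x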

    Highly Smoothness Zero-Order Methods for Solving Optimization Problems under PL Condition

    In this paper, we study the black-box optimization problem under the Polyak--Lojasiewicz (PL) condition, assuming that the objective function is not merely smooth but possesses higher-order smoothness. By using a "kernel-based" approximation instead of the exact gradient in the Stochastic Gradient Descent method, we improve the best known convergence results in the class of gradient-free algorithms solving problems under the PL condition. We generalize our results to the case where a zero-order oracle returns the function value at a point corrupted by some adversarial noise. We verify our theoretical results on the example of solving a system of nonlinear equations.
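
    A minimal sketch of a kernel-based (Polyak--Tsybakov-style) gradient estimator: the finite difference is reweighted by a kernel K chosen so that higher-order Taylor terms cancel. The kernel K(r) = 3r/2 below satisfies the standard conditions for smoothness order up to 3; the SGD loop and all constants are illustrative assumptions, not the paper's method.

    import numpy as np

    def kernel_grad_estimate(f, x, h, rng):
        # Two-point estimate reweighted by a kernel K on [-1, 1]:
        #   g = d/(2h) * (f(x + h*r*e) - f(x - h*r*e)) * K(r) * e,
        # with r ~ Uniform[-1, 1] and e uniform on the unit sphere.
        # K(r) = 3r/2 integrates r*K(r) to 1 and kills even moments;
        # higher smoothness requires higher-degree (Legendre) kernels.
        d = x.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)
        r = rng.uniform(-1.0, 1.0)
        return d / (2 * h) * (f(x + h * r * e) - f(x - h * r * e)) * (1.5 * r) * e

    def zo_sgd(f, x0, steps=2000, lr=0.05, h=1e-3, seed=0):
        # Plain SGD driven by the kernel-based estimate; under the PL
        # condition gradient descent converges linearly up to a noise floor.
        rng = np.random.default_rng(seed)
        x = x0.astype(float).copy()
        for _ in range(steps):
            x -= lr * kernel_grad_estimate(f, x, h, rng)
        return x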

    Upper bounds on maximum admissible noise in zeroth-order optimisation

    In this paper, starting from an information-theoretic upper bound on noise in convex Lipschitz-continuous zeroth-order optimisation, we provide corresponding upper bounds for the strongly convex and smooth classes of problems using non-constructive proofs through optimal reductions. We also show that, based on a one-dimensional grid-search optimisation algorithm, one can construct an algorithm for simplex-constrained optimisation with an upper bound on noise better than that for the ball-constrained and asymptotic-in-dimensionality case.
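
    To make the grid-search primitive concrete, here is a hedged one-dimensional sketch: for an L-Lipschitz function observed with bounded noise, evaluating on a grid of spacing eps/L returns a point whose true value is within roughly eps + 2*delta of the minimum. The paper's reduction to the simplex-constrained case is more involved; this is only the building block, with illustrative names and constants.

    import numpy as np

    def noisy_grid_search(f_noisy, a, b, eps, L):
        # Evaluate the noisy zeroth-order oracle on a uniform grid of
        # spacing eps / L over [a, b] and return the best grid point.
        # With |noise| <= delta, the true value at the returned point
        # is within about eps + 2*delta of the minimum of L-Lipschitz f.
        grid = np.arange(a, b + eps / L, eps / L)
        values = np.array([f_noisy(t) for t in grid])
        return grid[int(np.argmin(values))]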

    Non-Smooth Setting of Stochastic Decentralized Convex Optimization Problem Over Time-Varying Graphs

    Distributed optimization has a rich history and has demonstrated its effectiveness in many applications, machine learning among them. In this paper we study a subclass of distributed optimization, namely decentralized optimization in a non-smooth setting. Decentralized means that $m$ agents (machines) working in parallel on one problem communicate only with neighboring agents (machines), i.e. there is no (central) server through which agents communicate. By a non-smooth setting we mean that each agent has a convex stochastic non-smooth function, that is, agents can hold and communicate information only about the value of the objective function, which corresponds to a gradient-free oracle. In this paper, to minimize the global objective function, which consists of the sum of the functions of each agent, we create a gradient-free algorithm by applying a smoothing scheme via $l_2$ randomization. We also verify in experiments the theoretical convergence results obtained for the proposed gradient-free algorithm.
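
    A minimal sketch of one round of this kind of method, under stated assumptions: each agent forms a two-point $l_2$-randomized estimate of its local smoothed objective, takes a local step, and gossip-averages with its neighbours through a row-stochastic mixing matrix W (on a time-varying graph a different W would be supplied each round). The matrix W, step size, and signatures are illustrative, not the paper's algorithm.

    import numpy as np

    def decentralized_zo_round(fs, X, W, h, lr, rng):
        # fs : list of m local objectives (value access only)
        # X  : m x d matrix of local iterates
        # W  : m x m mixing matrix, row-stochastic, with W[i, j] > 0
        #      only for edges (i, j) of the current communication graph
        m, d = X.shape
        G = np.zeros_like(X)
        for i in range(m):
            e = rng.standard_normal(d)
            e /= np.linalg.norm(e)
            G[i] = d / (2 * h) * (fs[i](X[i] + h * e) - fs[i](X[i] - h * e)) * e
        return W @ (X - lr * G)  # local step, then gossip averaging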

    Gradient-free algorithm for saddle point problems under overparametrization

    This paper focuses on solving a stochastic saddle point problem (SPP) in an overparameterized regime for the case when gradient computation is impractical. As an intermediate step, we generalize the Same-sample Stochastic Extra-gradient algorithm (Gorbunov et al., 2022) to a biased oracle and establish novel convergence rates. As the main result of the paper we introduce an algorithm that uses a gradient approximation instead of a gradient oracle. We also conduct an analysis to find the maximum admissible level of adversarial noise and the optimal number of iterations at which our algorithm can guarantee achieving the desired accuracy.
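
    A hedged sketch of the extra-gradient template with a gradient approximation in place of the oracle. It is deterministic, so the "same-sample" aspect (reusing one stochastic sample in both steps of an iteration) is not shown; all names and step sizes are assumptions rather than the paper's scheme.

    import numpy as np

    def zo_partial_grad(f1d, z, h, rng):
        # Two-point l2-randomized estimate of the gradient of f1d at z.
        d = z.size
        e = rng.standard_normal(d)
        e /= np.linalg.norm(e)
        return d / (2 * h) * (f1d(z + h * e) - f1d(z - h * e)) * e

    def zo_extragradient(f, x0, y0, steps=1000, lr=0.05, h=1e-4, seed=0):
        # Extra-gradient for min_x max_y f(x, y): an extrapolation
        # half-step, then an update using estimates at the half-step.
        # Descent in x, ascent in y.
        rng = np.random.default_rng(seed)
        x, y = x0.astype(float).copy(), y0.astype(float).copy()
        for _ in range(steps):
            gx = zo_partial_grad(lambda u: f(u, y), x, h, rng)
            gy = zo_partial_grad(lambda v: f(x, v), y, h, rng)
            xh, yh = x - lr * gx, y + lr * gy
            gx = zo_partial_grad(lambda u: f(u, yh), xh, h, rng)
            gy = zo_partial_grad(lambda v: f(xh, v), yh, h, rng)
            x, y = x - lr * gx, y + lr * gy
        return x, y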

    Randomized gradient-free methods in convex optimization

    This review presents modern gradient-free methods for solving convex optimization problems. By gradient-free methods, we mean those that use only (noisy) realizations of the objective function value. We are motivated by various applications where gradient information is prohibitively expensive or even unavailable. We mainly focus on three criteria: oracle complexity, iteration complexity, and the maximum permissible noise level.

    Gradient-Free Federated Learning Methods with $l_1$ and $l_2$-Randomization for Non-Smooth Convex Stochastic Optimization Problems

    This paper studies non-smooth problems of convex stochastic optimization. Using a smoothing technique based on replacing the function value at the considered point by the function value averaged over a ball (in the $l_1$-norm or $l_2$-norm) of small radius centered at this point, the original problem is reduced to a smooth problem (whose gradient Lipschitz constant is inversely proportional to the radius of the ball). An important property of the smoothing used is the possibility of calculating an unbiased estimate of the gradient of the smoothed function based only on realizations of the original function. The resulting smooth stochastic optimization problem is proposed to be solved in a distributed federated learning architecture (the problem is solved in parallel: nodes make local steps, e.g. of stochastic gradient descent, then they communicate all-to-all, and the process repeats). The goal of this paper is to build, on the current advances in gradient-free non-smooth optimization and in the field of federated learning, gradient-free methods for solving non-smooth stochastic optimization problems in a federated learning architecture. (Paper in Russian.)
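
    A minimal sketch of the federated loop the abstract outlines, under stated assumptions: each node runs a few local SGD steps on its smoothed objective using only function realizations (a two-point $l_2$-randomized estimate standing in for the unbiased smoothed gradient), then all nodes communicate all-to-all and average their iterates. Round counts, step sizes, and names are illustrative, not the paper's method.

    import numpy as np

    def federated_zo_local_sgd(fs, x0, rounds=20, local_steps=10,
                               lr=0.05, h=1e-3, seed=0):
        # fs : list of m node objectives, accessed through values only.
        # Each round: local phase (gradient-free SGD steps per node),
        # then an all-to-all communication that averages the iterates.
        rng = np.random.default_rng(seed)
        m, d = len(fs), x0.size
        X = np.tile(x0.astype(float), (m, 1))
        for _ in range(rounds):
            for _ in range(local_steps):
                for i in range(m):
                    e = rng.standard_normal(d)
                    e /= np.linalg.norm(e)
                    g = d / (2 * h) * (fs[i](X[i] + h * e) - fs[i](X[i] - h * e)) * e
                    X[i] -= lr * g
            X[:] = X.mean(axis=0)  # all-to-all averaging step
        return X[0]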