12 research outputs found

Explicit Stabilised Gradient Descent for Faster Strongly Convex Optimisation

Author: A Abdulle
A Wibisono
AC Wilson
AS Vasudeva
BP Sommeijer
CJ Zbinden
D Scieur
E Hairer
H Zou
L Lessard
M Hochbruck
N Parikh
PJ van der Houwen
V Druskin
W Su
Y Nesterov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/06/2020

This paper introduces the Runge-Kutta Chebyshev descent method (RKCD) for strongly convex optimisation problems. This new algorithm is based on explicit stabilised integrators for stiff differential equations, a powerful class of numerical schemes that avoid the severe step size restriction faced by standard explicit integrators. For optimising quadratic and strongly convex functions, this paper proves that RKCD nearly achieves the optimal convergence rate of the conjugate gradient algorithm, and that the suboptimality of RKCD diminishes as the condition number of the quadratic function worsens. It is established that this optimal rate is also obtained for a partitioned variant of RKCD applied to perturbations of quadratic functions. In addition, numerical experiments on general strongly convex problems show that RKCD outperforms Nesterov's accelerated gradient descent.
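
To make the idea of an explicit stabilised integrator concrete, the following is a minimal sketch of one step of the classical first-order Runge-Kutta-Chebyshev scheme applied to the gradient flow x' = -∇f(x). It is not taken from the paper and is not claimed to be the exact RKCD algorithm: the function names, the damping value `eta=0.05`, and the rule used to pick the step size in the usage note are all assumptions made for illustration.

```python
import numpy as np

def chebyshev_values(s, x):
    """Chebyshev polynomials T_0..T_s and derivatives T'_0..T'_s at x,
    computed with the three-term recurrence."""
    T, dT = [1.0, x], [0.0, 1.0]
    for j in range(2, s + 1):
        T.append(2.0 * x * T[j - 1] - T[j - 2])
        dT.append(2.0 * T[j - 1] + 2.0 * x * dT[j - 1] - dT[j - 2])
    return T, dT

def rkc_coefficients(s, eta):
    """Damped first-order RKC parameters (omega_0, omega_1, T)."""
    w0 = 1.0 + eta / s**2
    T, dT = chebyshev_values(s, w0)
    return w0, T[s] / dT[s], T

def stability_limit(s, eta=0.05):
    """Largest beta such that |R_s(z)| <= 1 for all z in [-beta, 0]."""
    w0, w1, _ = rkc_coefficients(s, eta)
    return (1.0 + w0) / w1

def rkc_descent_step(grad, x, h, s, eta=0.05):
    """One s-stage RKC step of size h on x' = -grad(x) (illustrative sketch)."""
    w0, w1, T = rkc_coefficients(s, eta)
    k_prev = x
    k = x - (w1 / w0) * h * grad(x)          # first stage
    for j in range(2, s + 1):
        mu = 2.0 * w1 * T[j - 1] / T[j]
        nu = 2.0 * w0 * T[j - 1] / T[j]
        kappa = -T[j - 2] / T[j]
        # three-term stage recurrence inherited from the Chebyshev polynomials
        k_prev, k = k, -mu * h * grad(k) + nu * k + kappa * k_prev
    return k
```

For a quadratic with largest Hessian eigenvalue `L`, a step size such as `h = 0.9 * stability_limit(s) / L` keeps every eigenmode inside the stability interval. That interval grows roughly like s², whereas plain gradient descent (the case `s = 1`) is restricted to a step of order 1/L.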

arXiv.org e-Print Archive

Crossref

Publikationer från Umeå universitet

Edinburgh Research Explorer

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swepub

Archive ouverte UNIGE

On the Asymptotic Linear Convergence Speed of Anderson Acceleration, Nesterov Acceleration, and Nonlinear GMRES

Author: Nesterov Y.
Scieur D.
Washio T.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'

Crossref

Accélération Non-linéaire des Réseaux de Neurones Profonds (Nonlinear Acceleration of Deep Neural Networks)

Author: Bach Francis
d'Aspremont Alexandre
Oyallon Edouard
Scieur Damien
Publication venue: HAL CCSD
Publication date: 24/05/2018

Regularized nonlinear acceleration (RNA) is a generic extrapolation scheme for optimization methods, with marginal computational overhead. It aims to improve convergence using only the iterates of simple iterative algorithms. So far, however, its application to optimization has been theoretically limited to gradient descent and other single-step algorithms. Here, we adapt RNA to a much broader setting that includes stochastic gradient with momentum and Nesterov's fast gradient. We use it to train deep neural networks, and empirically observe that extrapolated networks are more accurate, especially in the early iterations. A straightforward application of our algorithm when training ResNet-152 on ImageNet produces a top-1 test error of 20.88%, improving on the reference classification pipeline by 0.8%. Furthermore, the code runs offline in this case, so it never negatively affects performance.
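
The extrapolation step that RNA performs on a window of iterates can be sketched as follows. This is a minimal illustration of the basic scheme for plain fixed-point iterations such as gradient descent, not the momentum or deep-learning variant described in the abstract. The function name `rna_extrapolate`, the default regularisation `lam`, and the normalisation of the Gram matrix are assumptions made for illustration.

```python
import numpy as np

def rna_extrapolate(iterates, lam=1e-8):
    """Combine iterates x_0..x_{k} into one extrapolated point (sketch).

    Uses the k residuals r_i = x_{i+1} - x_i, solves the regularised
    system (R R^T + lam * I) z = 1, normalises c = z / sum(z), and
    returns sum_i c_i x_i over the first k iterates.
    """
    X = np.asarray(iterates, dtype=float)      # shape (k + 1, d)
    R = np.diff(X, axis=0)                     # shape (k, d), residuals
    gram = R @ R.T
    gram = gram / np.linalg.norm(gram, 2)      # scale so lam is relative
    k = gram.shape[0]
    z = np.linalg.solve(gram + lam * np.eye(k), np.ones(k))
    c = z / z.sum()                            # coefficients sum to one
    return c @ X[:-1]
```

Because the routine only reads stored iterates, it can be run alongside the base optimiser without changing its trajectory, which matches the abstract's remark that the extrapolation runs offline.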

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Portail HAL UNIV-RENNES