735 research outputs found
Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii
Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii [arXiv:0708.0083]. Comment: Published at http://dx.doi.org/10.1214/009053606000001064 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
Exponential Screening and optimal rates of sparse estimation
In high-dimensional linear regression, the goal pursued here is to estimate an unknown regression function using linear combinations of a suitable set of covariates. One of the key assumptions for the success of any statistical procedure in this setup is that the linear combination is sparse in some sense, for example, that it involves only a few covariates. We consider a general, not necessarily linear, regression with Gaussian noise and study a related question: finding a linear combination of approximating functions that is at the same time sparse and has small mean squared error (MSE). We introduce a new estimation procedure, called Exponential Screening, that exhibits remarkable adaptation properties. It adapts to the linear combination that optimally balances MSE and sparsity, whether the latter is measured in terms of the number of non-zero entries in the combination ($\ell_0$ norm) or in terms of the global weight of the combination ($\ell_1$ norm). The power of this adaptation result is illustrated by showing that Exponential Screening solves optimally and simultaneously all the problems of aggregation in Gaussian regression that have been discussed in the literature. Moreover, we show that the performance of the Exponential Screening estimator cannot be improved in a minimax sense, even if the optimal sparsity is known in advance. The theoretical and numerical superiority of Exponential Screening compared to state-of-the-art sparse procedures is also discussed.
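The abstract describes an aggregation-type procedure; as commonly presented, Exponential Screening combines least squares fits computed on candidate supports using exponentially decaying weights with a prior favouring small supports. The Python sketch below only illustrates that aggregation idea: the function name, the temperature choice and the prior are assumptions made for illustration, and the enumeration is restricted to supports of size at most two to keep it tractable; it is not the paper's estimator or tuning.

import itertools
import numpy as np

def exponential_screening_sketch(X, y, sigma2, max_support=2, temperature=None):
    """Toy exponential-weights aggregation of least squares fits on small supports.

    Illustrative only: the temperature and the support-size prior below are
    assumed choices, not the ones analyzed in the paper.
    """
    n, p = X.shape
    if temperature is None:
        temperature = 4.0 * sigma2  # illustrative temperature
    supports = [()]  # include the empty support (zero estimator)
    for k in range(1, max_support + 1):
        supports += list(itertools.combinations(range(p), k))
    thetas, log_weights = [], []
    for S in supports:
        theta = np.zeros(p)
        if S:
            XS = X[:, list(S)]
            theta[list(S)] = np.linalg.lstsq(XS, y, rcond=None)[0]
        rss = np.sum((y - X @ theta) ** 2)
        # exponential weight: data fit plus a prior penalizing the support size
        log_weights.append(-rss / temperature - len(S) * np.log(p))
        thetas.append(theta)
    log_weights = np.array(log_weights)
    w = np.exp(log_weights - log_weights.max())
    w /= w.sum()
    return np.array(thetas).T @ w  # aggregated coefficient vector

For realistic dimensions one would not enumerate supports exhaustively; the sketch is only meant to show how the exponential weights trade off residual fit against sparsity.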
Discussion: Latent variable graphical model selection via convex optimization
Discussion of "Latent variable graphical model selection via convex
optimization" by Venkat Chandrasekaran, Pablo A. Parrilo and Alan S. Willsky
[arXiv:1008.1290].Comment: Published in at http://dx.doi.org/10.1214/12-AOS984 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
On Walsh code assignment
The paper considers the problem of orthogonal variable spreading Walsh-code assignment. The aim of the paper is to provide assignments that avoid both complicated signaling from the base station (BS) to the users and blind rate and code detection among a great number of possible codes. The assignments considered here partition all users into several pools; each pool uses its own set of codes, and different pools use different codes. Each user is assigned only a few codes within its pool. We state the problem as a combinatorial one, expressed in terms of a binary n x k matrix M, where n is the number of users and k is the number of Walsh codes in the pool. A solution to the problem is a construction of M that has the assignment property defined in the paper. Two such constructions of M are presented, under different conditions on n and k. The first construction is optimal in the sense that it gives the minimal number of Walsh codes assigned to each user for given n and k; the optimality follows from a proved necessary condition for the existence of M with the assignment property. In addition, we propose a simple algorithm of optimal assignment for the first construction.
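The abstract's setup (users partitioned into pools, each pool owning its own block of codes, each user holding a few codes within its pool) can be pictured with a small sketch. The construction below is hypothetical: it is a plain round-robin assignment used only to show the shape of the binary matrix M, not one of the paper's constructions, and it does not enforce the assignment property defined in the paper.

import numpy as np

def pool_assignment_sketch(n_users, n_pools, codes_per_pool, codes_per_user):
    """Toy pool-based Walsh-code assignment (round-robin, illustrative only)."""
    total_codes = n_pools * codes_per_pool
    M = np.zeros((n_users, total_codes), dtype=int)  # binary assignment matrix
    for u in range(n_users):
        pool = u % n_pools                # partition users into pools
        base = pool * codes_per_pool      # first code index owned by this pool
        for j in range(codes_per_user):
            # round-robin choice of codes inside the user's pool
            M[u, base + (u // n_pools + j) % codes_per_pool] = 1
    return M

# Example: 12 users, 3 pools of 4 codes each, 2 codes assigned per user
M = pool_assignment_sketch(12, 3, 4, 2)
assert (M.sum(axis=1) == 2).all()  # every user holds exactly two codes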
Estimation of high-dimensional low-rank matrices
Suppose that we observe entries or, more generally, linear combinations of entries of an unknown $m\times T$-matrix $A$ corrupted by noise. We are particularly interested in the high-dimensional setting where the number $mT$ of unknown entries can be much larger than the sample size $n$. Motivated by several applications, we consider estimation of matrix $A$ under the assumption that it has small rank. This can be viewed as a dimension reduction or sparsity assumption. In order to shrink toward a low-rank representation, we investigate penalized least squares estimators with a Schatten-$p$ quasi-norm penalty term, $0<p\le 1$. We study these estimators under two possible assumptions: a modified version of the restricted isometry condition and a uniform bound on the ratio "empirical norm induced by the sampling operator/Frobenius norm." The main results are stated as nonasymptotic upper bounds on the prediction risk and on the Schatten-$q$ risk of the estimators, where $q\in[p,2]$. The rates that we obtain for the prediction risk are of the form $rm/n$ (for $m=T$), up to logarithmic factors, where $r$ is the rank of $A$. The particular examples of multi-task learning and matrix completion are worked out in detail. The proofs are based on tools from the theory of empirical processes. As a by-product, we derive bounds for the $k$th entropy numbers of the quasi-convex Schatten class embeddings $S_p^M \hookrightarrow S_2^M$, $0<p<1$, which are of independent interest. Comment: Published at http://dx.doi.org/10.1214/10-AOS860 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
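For the special case of the nuclear norm (Schatten-1) penalty with every entry observed once, the penalized least squares estimator has a well-known closed form: soft-thresholding of the singular values of the data matrix. The sketch below shows that simpler relative of the general Schatten-$p$ penalties studied in the paper; the regularization level and the toy data are illustrative choices only.

import numpy as np

def soft_threshold_svd(Y, lam):
    """Nuclear-norm penalized least squares when the full matrix is observed.

    Minimizes ||Y - A||_F^2 / 2 + lam * ||A||_{S_1}, whose solution is
    soft-thresholding of the singular values of Y. A sketch of the p = 1
    case only, not the paper's Schatten-p estimators.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)      # shrink singular values toward zero
    return U @ np.diag(s_shrunk) @ Vt        # low-rank reconstruction

# Example: noisy observation of a rank-2 matrix (toy data)
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
Y = A + 0.5 * rng.standard_normal(A.shape)
A_hat = soft_threshold_svd(Y, lam=8.0)
print(np.linalg.matrix_rank(A_hat, tol=1e-8))  # typically small here (the signal has rank 2)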
Estimation of matrices with row sparsity
An increasing number of applications are concerned with recovering a sparse matrix from noisy observations. In this paper, we consider the setting where each row of the unknown matrix is sparse. We establish minimax optimal rates of convergence for estimating matrices with row sparsity. A major focus of the present paper is the derivation of lower bounds.
Fast learning rates for plug-in classifiers under the margin condition
It has been recently shown that, under the margin (or low noise) assumption, there exist classifiers attaining fast rates of convergence of the excess Bayes risk, i.e., rates faster than $n^{-1/2}$. The works on this subject suggested the following two conjectures: (i) the best achievable fast rate is of the order $n^{-1}$, and (ii) plug-in classifiers generally converge more slowly than classifiers based on empirical risk minimization. We show that both conjectures are not correct. In particular, we construct plug-in classifiers that can achieve not only fast, but also {\it super-fast} rates, i.e., rates faster than $n^{-1}$. We establish minimax lower bounds showing that the obtained rates cannot be improved. Comment: 36 pages
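A plug-in classifier estimates the regression function eta(x) = P(Y = 1 | X = x) and predicts label 1 wherever the estimate exceeds 1/2. The sketch below uses k-nearest-neighbour averaging for the estimate purely for brevity; the paper analyzes plug-in rules built on specific nonparametric estimators of eta, not this toy choice, and k here is an arbitrary illustrative value.

import numpy as np

def plug_in_classifier_sketch(X_train, y_train, X_test, k=15):
    """Plug-in classification: threshold a nonparametric estimate of eta at 1/2.

    The estimate of eta uses plain k-NN averaging, chosen only for brevity.
    """
    preds = np.empty(len(X_test), dtype=int)
    for i, x in enumerate(X_test):
        dist = np.linalg.norm(X_train - x, axis=1)  # distances to training points
        neighbours = np.argsort(dist)[:k]           # indices of the k nearest
        eta_hat = y_train[neighbours].mean()        # estimate of P(Y=1 | X=x)
        preds[i] = int(eta_hat >= 0.5)              # plug-in decision rule
    return preds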
Statistical inference in compound functional models
We consider a general nonparametric regression model called the compound
model. It includes, as special cases, sparse additive regression and
nonparametric (or linear) regression with many covariates but possibly a small
number of relevant covariates. The compound model is characterized by three
main parameters: the structure parameter describing the "macroscopic" form of
the compound function, the "microscopic" sparsity parameter indicating the
maximal number of relevant covariates in each component and the usual
smoothness parameter corresponding to the complexity of the members of the
compound. We find the non-asymptotic minimax rate of convergence of estimators in such a model as a function of these three parameters. We also show that this rate can be attained in an adaptive way.
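Under one reading of the abstract, a compound function is a sum of components, each depending on only a small set of covariates; sparse additive regression is the special case where every component depends on a single covariate. The sketch below is a hypothetical toy construction of such a function for simulation purposes; the supports, component functions and noise level are illustrative and do not come from the paper.

import numpy as np

def compound_function_sketch(X, supports, components):
    """Evaluate a toy compound function: a sum of components g_j(x restricted to S_j).

    Illustration only: the 'structure' is the collection of supports S_j, the
    'microscopic sparsity' is the maximal size of an S_j, and smoothness refers
    to the components g_j. Sparse additive regression corresponds to |S_j| = 1.
    """
    return sum(g(X[:, list(S)]) for S, g in zip(supports, components))

# Example: two components, each depending on at most two of five covariates
rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 5))
supports = [(0,), (2, 4)]
components = [lambda Z: np.sin(np.pi * Z[:, 0]),
              lambda Z: Z[:, 0] * Z[:, 1]]
y = compound_function_sketch(X, supports, components) + 0.1 * rng.standard_normal(100)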
- …
