
    Exact Post-Selection Inference for Sequential Regression Procedures

    We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection event that can be characterized as y falling into a polyhedral set. This framework allows us to derive conditional (post-selection) hypothesis tests at any step of forward stepwise or least angle regression, or any step along the lasso regularization path, because, as it turns out, selection events for these procedures can be expressed as polyhedral constraints on y. The p-values associated with these tests are exactly uniform under the null distribution, in finite samples, yielding exact type I error control. The tests can also be inverted to produce confidence intervals for appropriate underlying regression parameters. The R package "selectiveInference", freely available on the CRAN repository, implements the new inference tools described in this paper.
    Comment: 26 pages, 5 figures
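    To make the polyhedral framework concrete: conditional on a selection event {Ay ≤ b} and on the component of y not aligned with a contrast vector η, the statistic η'y follows a Gaussian truncated to an interval whose endpoints are computable from A, b, and η. Below is a minimal Python sketch of the resulting truncated-Gaussian p-value, assuming covariance σ²I; the function name and interface are illustrative, not the selectiveInference package's API.

```python
import numpy as np
from scipy.stats import norm

def polyhedral_pvalue(y, A, b, eta, sigma):
    """One-sided p-value for H0: eta'mu = 0, given y ~ N(mu, sigma^2 I),
    conditional on the polyhedral selection event {A y <= b}.
    A sketch of the 'polyhedral lemma'; names here are illustrative."""
    eta = np.asarray(eta, dtype=float)
    eta_y = eta @ y
    c = eta / (eta @ eta)   # = Sigma eta (eta' Sigma eta)^{-1} when Sigma = sigma^2 I
    z = y - c * eta_y       # part of y held fixed by the conditioning
    Ac, Az = A @ c, A @ z
    slack = b - Az
    # A y <= b  <=>  (Ac) * (eta'y) <= b - Az, giving truncation limits for eta'y:
    lower = np.max(slack[Ac < 0] / Ac[Ac < 0]) if np.any(Ac < 0) else -np.inf
    upper = np.min(slack[Ac > 0] / Ac[Ac > 0]) if np.any(Ac > 0) else np.inf
    s = sigma * np.sqrt(eta @ eta)  # standard deviation of eta'y
    F = lambda t: norm.cdf(t / s)
    # CDF transform of the truncated Gaussian: exactly Uniform(0,1) under H0
    return (F(upper) - F(eta_y)) / (F(upper) - F(lower))
```

    Inverting the same CDF transform over a grid of hypothesized values of η'μ yields the selective confidence intervals mentioned above.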

    A General Framework for Fast Stagewise Algorithms

    Forward stagewise regression follows a very simple strategy for constructing a sequence of sparse regression estimates: it starts with all coefficients equal to zero, and iteratively updates the coefficient (by a small amount ε) of the variable that achieves the maximal absolute inner product with the current residual. This procedure has an interesting connection to the lasso: under some conditions, it is known that the sequence of forward stagewise estimates exactly coincides with the lasso path, as the step size ε goes to zero. Furthermore, essentially the same equivalence holds outside of least squares regression, with the minimization of a differentiable convex loss function subject to an ℓ₁ norm constraint (the stagewise algorithm now updates the coefficient corresponding to the maximal absolute component of the gradient). Even when they do not match their ℓ₁-constrained analogues, stagewise estimates provide a useful approximation, and are computationally appealing. Their success in sparse modeling motivates the question: can a simple, effective strategy like forward stagewise be applied more broadly in other regularization settings, beyond the ℓ₁ norm and sparsity? The current paper is an attempt to do just this. We present a general framework for stagewise estimation, which yields fast algorithms for problems such as group-structured learning, matrix completion, image denoising, and more.
    Comment: 56 pages, 15 figures
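    The update rule described above translates nearly line-for-line into code. Here is a minimal Python sketch of forward stagewise for least squares; the step size and iteration count are illustrative defaults, not values from the paper.

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=1000):
    """Forward stagewise regression: repeatedly nudge, by eps, the coefficient
    of the variable with the largest absolute inner product with the current
    residual. Returns the full path of coefficient vectors."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    path = [beta.copy()]
    for _ in range(n_steps):
        corr = X.T @ resid                # inner products with current residual
        j = np.argmax(np.abs(corr))       # most correlated variable
        delta = eps * np.sign(corr[j])    # small step in the sign of the correlation
        beta[j] += delta
        resid -= delta * X[:, j]          # incremental residual update
        path.append(beta.copy())
    return np.array(path)
```

    As ε shrinks (with correspondingly more steps), this path approximates, and under the abstract's conditions exactly matches, the lasso path.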

    Rejoinder: "A significance test for the lasso"

    Rejoinder of "A significance test for the lasso" by Richard Lockhart, Jonathan Taylor, Ryan J. Tibshirani, Robert Tibshirani [arXiv:1301.7161].
    Comment: Published at http://dx.doi.org/10.1214/14-AOS1175REJ in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org). With Correction.

    An Ordered Lasso and Sparse Time-Lagged Regression

    We consider regression scenarios where it is natural to impose an order constraint on the coefficients. We propose an order-constrained version of L1-regularized regression for this problem, and show how to solve it efficiently using the well-known Pool Adjacent Violators Algorithm as its proximal operator. The main application of this idea is time-lagged regression, where we predict an outcome at time t from features at the previous K time points. In this setting it is natural to assume that the coefficients decay as we move farther away from t, and hence the order constraint is reasonable. Potential applications include financial time series and prediction of dynamic patient outcomes based on clinical measurements. We illustrate this idea on real and simulated data.
    Comment: 15 pages, 6 figures
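    To illustrate how the Pool Adjacent Violators Algorithm can serve as a proximal operator here, the sketch below implements a plausible prox step for an order-constrained, nonnegative L1 penalty: shift by the scaled penalty, project onto the monotone-decreasing cone with PAVA, then clip at zero. The splitting and names are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def pava_increasing(v):
    """Pool Adjacent Violators: least-squares projection of v onto
    nondecreasing sequences, pooling violating adjacent blocks into means."""
    vals, wts = [], []
    for x in np.asarray(v, dtype=float):
        vals.append(x); wts.append(1)
        # pool adjacent blocks while the nondecreasing order is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            w = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / w
            wts[-2] = w
            del vals[-1], wts[-1]
    return np.repeat(vals, wts)

def ordered_prox(u, lam_step):
    """Prox of lam*sum(beta) + indicator{beta_1 >= ... >= beta_K >= 0},
    with the step size folded into lam_step: shift, project onto the
    decreasing cone (PAVA on the reversed vector), then clip at zero."""
    mono = pava_increasing((u - lam_step)[::-1])[::-1]
    return np.maximum(mono, 0.0)
```

    A proximal gradient iteration would then alternate this prox with gradient steps on the least-squares loss, e.g. beta = ordered_prox(beta - t * X.T @ (X @ beta - y), t * lam) for a step size t.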