
2,347 research outputs found

High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections

Author: A. Engel
D. J. Amit
I. M. Johnstone
I. T. Jolliffe
R. Monasson
R. Tibshirani
S. Cocco
T. Hastie
V. Sessak
Z. Bai
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2011

We consider the problem of inferring the interactions between a set of $N$ binary variables from the knowledge of their frequencies and pairwise correlations. The inference framework is based on the Hopfield model, a special case of the Ising model where the interaction matrix is defined through a set of patterns in the variable space, and is of rank much smaller than $N$. We show that Maximum Likelihood inference is deeply related to Principal Component Analysis when the amplitude of the pattern components, $\xi$, is negligible compared to $N^{1/2}$. Using techniques from statistical mechanics, we calculate the corrections to the patterns to first order in $\xi/N^{1/2}$. We stress that it is important to generalize the Hopfield model and include both attractive and repulsive patterns, to correctly infer networks with sparse and strong interactions. We present a simple geometrical criterion to decide how many attractive and repulsive patterns should be considered as a function of the sampling noise. We moreover discuss how many sampled configurations are required for a good inference, as a function of the system size, $N$, and of the amplitude, $\xi$. The inference approach is illustrated on synthetic and biological data. Comment: Physical Review E: Statistical, Nonlinear, and Soft Matter Physics (2011) to appear

arXiv.org e-Print Archive

Crossref

Hal-Diderot

Elastic-Net Regularization: Error estimates and Active Set Methods

Author: Attouch H
Bangti Jin
Bonesky T
Burger M
Dirk A Lorenz
Engl H W
Grasmair M
Griesse R
Jin B Zou J
Lee H
Loris I
Rockafellar R T
Stefan Schiffler
Tibshirani R
Publication venue: 'IOP Publishing'
Publication date: 01/01/2009

This paper investigates theoretical properties and efficient numerical algorithms for the so-called elastic-net regularization originating from statistics, which enforces simultaneously $\ell^1$ and $\ell^2$ regularization. The stability of the minimizer and its consistency are studied, and convergence rates for both a priori and a posteriori parameter choice rules are established. Two iterative numerical algorithms of active set type are proposed, and their convergence properties are discussed. Numerical results are presented to illustrate the features of the functional and algorithms.
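
For orientation, a commonly used form of an elastic-net functional is sketched below. The abstract does not state the functional, so the operator $K$, the data $y$, and the weights $\alpha, \beta$ are assumed notation and may differ from the paper's.

```latex
% Assumed notation: K is a linear operator, y the data,
% alpha >= 0 weights the l^1 term, beta >= 0 weights the l^2 term.
\[
  J_{\alpha,\beta}(x) \;=\; \tfrac{1}{2}\,\lVert Kx - y \rVert_2^2
  \;+\; \alpha\,\lVert x \rVert_1
  \;+\; \tfrac{\beta}{2}\,\lVert x \rVert_2^2 .
\]
```

In this form, $\beta = 0$ reduces to pure $\ell^1$ (lasso-type) regularization and $\alpha = 0$ to classical Tikhonov regularization.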

arXiv.org e-Print Archive

Crossref

UCL Discovery

On the performance of algorithms for the minimization of $\ell_1$-penalized functionals

Author: Beck A
Donoho D Stodden V Tsaig Y
Figueiredo M A T Nowak R D Wright S J
Hale E T Yin W Zhang Y
Ignace Loris
Sjöstrand K
Tibshirani R
Publication venue: 'IOP Publishing'
Publication date: 01/01/2009

The problem of assessing the performance of algorithms used for the minimization of an $\ell_1$-penalized least-squares functional, for a range of penalty parameters, is investigated. A criterion that uses the idea of 'approximation isochrones' is introduced. Five different iterative minimization algorithms are tested and compared, as well as two warm-start strategies. Both well-conditioned and ill-conditioned problems are used in the comparison, and the contrast between these two categories is highlighted. Comment: 18 pages, 10 figures; v3: expanded version with an additional synthetic test problem
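
As a rough illustration of what an iterative minimizer for an $\ell_1$-penalized least-squares functional can look like, here is a minimal pure-Python sketch of iterative soft-thresholding. It is not claimed to be one of the five algorithms compared in the paper; the function names `soft_threshold` and `ista`, the step size, and the toy data are all assumptions made for this example.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal map of t * |.|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, step, n_iter=500):
    """Approximately minimize 0.5 * ||A x - y||^2 + lam * ||x||_1.

    A is a list of rows, y a list of floats. `step` is assumed to be
    at most 1 / ||A||^2 so that the iteration does not diverge.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = A x - y.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the least-squares term: A^T r.
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft-thresholding.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


# Toy problem (hypothetical data): with A the identity, the minimizer is
# simply the soft-thresholded data vector.
A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
x_hat = ista(A, y, lam=1.0, step=1.0)
print(x_hat)
```

On this toy problem the small component of `y` is set exactly to zero while the large one is shrunk by the penalty, which is the sparsity-promoting behaviour the $\ell_1$ term is used for.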

arXiv.org e-Print Archive

Crossref

DI-fusion

Implicitly Constrained Semi-Supervised Least Squares Classification

Author: B Widrow
GJ McLachlan
K Nigam
KP Bennett
L Bottou
M Loog
M Opper
O Chapelle
R Rifkin
R Tibshirani
RH Byrd
S Raudys
T Hastie
T Poggio
X Zhu
YF Li
Publication date: 24/07/2015

We introduce a novel semi-supervised version of the least squares classifier. This implicitly constrained least squares (ICLS) classifier minimizes the squared loss on the labeled data among the set of parameters implied by all possible labelings of the unlabeled data. Unlike other discriminative semi-supervised methods, our approach does not introduce explicit additional assumptions into the objective function, but leverages implicit assumptions already present in the choice of the supervised least squares classifier. We show this approach can be formulated as a quadratic programming problem and its solution can be found using a simple gradient descent procedure. We prove that, in a certain way, our method never leads to performance worse than the supervised classifier. Experimental results corroborate this theoretical result in the multidimensional case on benchmark datasets, also in terms of the error rate. Comment: 12 pages, 2 figures, 1 table. The Fourteenth International Symposium on Intelligent Data Analysis (2015), Saint-Etienne, France

arXiv.org e-Print Archive

Crossref

Efficient Model Learning for Human-Robot Collaborative Tasks

Author: Atkeson C. G.
Chernova S.
Jaaskinen V.
Kurniawati H.
Macindoe O.
Marden J. I.
Nguyen T.-H. D.
Ong S. C.
Pineau J.
Syed U.
Tibshirani R.
Waugh K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/05/2014

We present a framework for learning human user models from joint-action demonstrations that enables the robot to compute a robust policy for a collaborative task with a human. The learning takes place completely automatically, without any human intervention. First, we describe the clustering of demonstrated action sequences into different human types using an unsupervised learning algorithm. These demonstrated sequences are also used by the robot to learn a reward function that is representative for each type, through the employment of an inverse reinforcement learning algorithm. The learned model is then used as part of a Mixed Observability Markov Decision Process formulation, wherein the human type is a partially observable variable. With this framework, we can infer, either offline or online, the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this new user and will be robust to deviations of the human actions from prior demonstrations. Finally, we validate the approach using data collected in human subject experiments, and conduct proof-of-concept demonstrations in which a person performs a collaborative task with a small industrial robot.

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref


Discovery of molecular subtypes in leiomyosarcoma through integrative molecular profiling.

Author: Beck A
Edris B
Espinosa I
Fletcher C
Gleason B
Hastie T
Jablons David
Lee C-H
Li R
Marinelli R
Montgomery K
Rubin B
Tibshirani R
van de Rijn M
West R
Witten D
Zhu S
Publication venue: eScholarship, University of California
Publication date: 11/02/2010

Leiomyosarcoma (LMS) is a soft tissue tumor with a significant degree of morphologic and molecular heterogeneity. We used integrative molecular profiling to discover and characterize molecular subtypes of LMS. Gene expression profiling was performed on 51 LMS samples. Unsupervised clustering showed three reproducible LMS clusters. Array comparative genomic hybridization (aCGH) was performed on 20 LMS samples and showed that the molecular subtypes defined by gene expression showed distinct genomic changes. Tumors from the muscle-enriched cluster showed significantly increased copy number changes (P=0.04). A majority of the muscle-enriched cases showed loss at 16q24, which contains Fanconi anemia, complementation group A, known to have an important role in DNA repair, and loss at 1p36, which contains PRDM16, of which loss promotes muscle differentiation. Immunohistochemistry (IHC) was performed on LMS tissue microarrays (n=377) for five markers with high levels of messenger RNA in the muscle-enriched cluster (ACTG2, CASQ2, SLMAP, CFL2 and MYLK) and showed significantly correlated expression of the five proteins (all pairwise P<0.005). Expression of the five markers was associated with improved disease-specific survival in a multivariate Cox regression analysis (P<0.04). In this analysis that combined gene expression profiling, aCGH and IHC, we characterized distinct molecular LMS subtypes, provided insight into their pathogenesis, and identified prognostic biomarkers.

eScholarship - University of California

A typical reconstruction limit of compressed sensing based on Lp-norm minimization

Author: Boyd S
de Almeida J R L
Dotsenko V S
Gilbert E N
Grant M Boyd S
Mézard M
Rangan S Fletcher A K Goyal V K
Shannon C E
T Tanaka
T Wadayama
Takeda K
Talagrand M
Tibshirani R
Y Kabashima
Publication venue: 'IOP Publishing'
Publication date: 05/12/2009

We consider the problem of reconstructing an $N$-dimensional continuous vector $\mathbf{x}$ from $P$ constraints which are generated by its linear transformation under the assumption that the number of non-zero elements of $\mathbf{x}$ is typically limited to $\rho N$ ($0\le \rho \le 1$). Problems of this type can be solved by minimizing a cost function with respect to the $L_p$-norm $||\mathbf{x}||_p=\lim_{\epsilon \to +0}\sum_{i=1}^N |x_i|^{p+\epsilon}$, subject to the constraints under an appropriate condition. For several $p$, we assess a typical case limit $\alpha_c(\rho)$, which represents a critical relation between $\alpha=P/N$ and $\rho$ for successfully reconstructing the original vector by minimization for typical situations in the limit $N,P \to \infty$ with keeping $\alpha$ finite, utilizing the replica method. For $p=1$, $\alpha_c(\rho)$ is considerably smaller than its worst case counterpart, which has been rigorously derived by existing literature of information theory. Comment: 12 pages, 2 figures

arXiv.org e-Print Archive

Crossref

Differentially Private Model Selection with Penalized and Constrained Likelihood

Author: Chaudhuri K.
Dalenius T.
Duchi J. C.
Fienberg S.
Gaboardi M.
Hardt M.
Lei J.
Rubin D. B.
Smith A.
Tibshirani R.
Uhler C.
Publication date: 14/07/2016

In statistical disclosure control, the goal of data analysis is twofold: The released information must provide accurate and useful statistics about the underlying population of interest, while minimizing the potential for an individual record to be identified. In recent years, the notion of differential privacy has received much attention in theoretical computer science, machine learning, and statistics. It provides a rigorous and strong notion of protection for individuals' sensitive information. A fundamental question is how to incorporate differential privacy into traditional statistical inference procedures. In this paper we study model selection in multivariate linear regression under the constraint of differential privacy. We show that model selection procedures based on penalized least squares or likelihood can be made differentially private by a combination of regularization and randomization, and propose two algorithms to do so. We show that our private procedures are consistent under essentially the same conditions as the corresponding non-private procedures. We also find that under differential privacy, the procedure becomes more sensitive to the tuning parameters. We illustrate and evaluate our method using simulation studies and two real data examples.

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)