High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections
We consider the problem of inferring the interactions between a set of N
binary variables from the knowledge of their frequencies and pairwise
correlations. The inference framework is based on the Hopfield model, a special
case of the Ising model where the interaction matrix is defined through a set
of patterns in the variable space, and is of rank much smaller than N. We show
that Maximum Likelihood inference is deeply related to Principal Component
Analysis when the amplitude of the pattern components, xi, is negligible
compared to N^1/2. Using techniques from statistical mechanics, we calculate
the corrections to the patterns to the first order in xi/N^1/2. We stress that
it is important to generalize the Hopfield model and include both attractive
and repulsive patterns, to correctly infer networks with sparse and strong
interactions. We present a simple geometrical criterion to decide how many
attractive and repulsive patterns should be considered as a function of the
sampling noise. We moreover discuss how many sampled configurations are
required for a good inference, as a function of the system size, N, and of the
amplitude, xi. The inference approach is illustrated on synthetic and
biological data.
Comment: Physical Review E: Statistical, Nonlinear, and Soft Matter Physics
(2011) to appear
Elastic-Net Regularization: Error estimates and Active Set Methods
This paper investigates theoretical properties and efficient numerical
algorithms for the so-called elastic-net regularization originating from
statistics, which enforces simultaneously l^1 and l^2 regularization. The
stability of the minimizer and its consistency are studied, and convergence
rates for both a priori and a posteriori parameter choice rules are
established. Two iterative numerical algorithms of active set type are
proposed, and their convergence properties are discussed. Numerical results are
presented to illustrate the features of the functional and algorithms.
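To make the combined l^1 and l^2 penalty concrete, here is a minimal sketch that minimizes 0.5*||Ax-b||^2 + alpha*||x||_1 + 0.5*beta*||x||_2^2 by cyclic coordinate descent. This is a generic textbook solver, not the active set algorithms proposed in the paper; the names `elastic_net_cd`, `alpha`, and `beta`, and the exact scaling of the functional, are assumptions made for illustration. It assumes beta > 0 or non-zero columns in A.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal map of t*|.|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def elastic_net_cd(A, b, alpha, beta, sweeps=200):
    """Minimize 0.5*||Ax-b||^2 + alpha*||x||_1 + 0.5*beta*||x||_2^2.

    A is a list of rows, b a list; returns the coefficient list x.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = list(b)  # residual = b - A x, and x starts at zero
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(sweeps):
        for j in range(n):
            # Correlation of column j with the residual that excludes x_j.
            rho = sum(A[i][j] * (residual[i] + A[i][j] * x[j]) for i in range(m))
            # The l^1 term thresholds, the l^2 term rescales.
            new_xj = soft_threshold(rho, alpha) / (col_sq[j] + beta)
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
    return x
```

With an identity design matrix each coordinate decouples, so the minimizer is the soft-thresholded data divided by (1 + beta). That shows the two penalties acting at once: small entries are set exactly to zero and the remaining ones are shrunk.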
On the performance of algorithms for the minimization of l^1-penalized functionals
The problem of assessing the performance of algorithms used for the
minimization of an l^1-penalized least-squares functional, for a range of
penalty parameters, is investigated. A criterion that uses the idea of
`approximation isochrones' is introduced. Five different iterative minimization
algorithms are tested and compared, as well as two warm-start strategies. Both
well-conditioned and ill-conditioned problems are used in the comparison, and
the contrast between these two categories is highlighted.
Comment: 18 pages, 10 figures; v3: expanded version with an additional
synthetic test problem
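As an illustration of what a warm-start strategy over a range of penalty parameters can look like, the sketch below runs iterative soft-thresholding (ISTA) for 0.5*||Ax-b||^2 + lam*||x||_1 and optionally reuses the previous solution as the starting point for the next penalty. ISTA is chosen here only because it is short; the abstract does not say which five algorithms or which two warm-start strategies were compared, and the function names, the scaling of the functional, and the stopping rule are assumptions.

```python
def ista(A, b, lam, x0, tol=1e-10, max_iter=100000):
    """Iterative soft-thresholding for 0.5*||Ax-b||^2 + lam*||x||_1.

    Returns the estimated minimizer and the number of iterations used.
    """
    m, n = len(A), len(A[0])
    # Step 1/||A||_F^2 is safe: ||A||_F^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = list(x0)
    for it in range(1, max_iter + 1):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        new_x = []
        for j in range(n):
            v = x[j] - step * grad[j]
            shrink = max(abs(v) - step * lam, 0.0)
            new_x.append(shrink if v >= 0 else -shrink)
        change = max(abs(new_x[j] - x[j]) for j in range(n))
        x = new_x
        if change < tol:
            return x, it
    return x, max_iter


def solve_path(A, b, lams, warm_start=True):
    """Solve for each penalty in lams, optionally starting from the last solution."""
    n = len(A[0])
    x = [0.0] * n
    results = []
    for lam in lams:
        start = x if warm_start else [0.0] * n
        x, iters = ista(A, b, lam, start)
        results.append((lam, x, iters))
    return results
```

Calling `solve_path` once with `warm_start=True` and once with `warm_start=False` on the same list of penalties gives per-penalty iteration counts that can be compared directly. How much warm starting helps depends on the conditioning of A, which is the contrast the abstract highlights.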
Implicitly Constrained Semi-Supervised Least Squares Classification
We introduce a novel semi-supervised version of the least squares classifier.
This implicitly constrained least squares (ICLS) classifier minimizes the
squared loss on the labeled data among the set of parameters implied by all
possible labelings of the unlabeled data. Unlike other discriminative
semi-supervised methods, our approach does not introduce explicit additional
assumptions into the objective function, but leverages implicit assumptions
already present in the choice of the supervised least squares classifier. We
show this approach can be formulated as a quadratic programming problem and its
solution can be found using a simple gradient descent procedure. We prove that,
in a certain way, our method never leads to performance worse than the
supervised classifier. Experimental results corroborate this theoretical result
in the multidimensional case on benchmark datasets, also in terms of the error
rate.
Comment: 12 pages, 2 figures, 1 table. The Fourteenth International Symposium
on Intelligent Data Analysis (2015), Saint-Etienne, France
Efficient Model Learning for Human-Robot Collaborative Tasks
We present a framework for learning human user models from joint-action
demonstrations that enables the robot to compute a robust policy for a
collaborative task with a human. The learning takes place completely
automatically, without any human intervention. First, we describe the
clustering of demonstrated action sequences into different human types using an
unsupervised learning algorithm. These demonstrated sequences are also used by
the robot to learn a reward function that is representative for each type,
through the employment of an inverse reinforcement learning algorithm. The
learned model is then used as part of a Mixed Observability Markov Decision
Process formulation, wherein the human type is a partially observable variable.
With this framework, we can infer, either offline or online, the human type of
a new user that was not included in the training set, and can compute a policy
for the robot that will be aligned to the preference of this new user and will
be robust to deviations of the human actions from prior demonstrations. Finally,
we validate the approach using data collected in human subject experiments, and
conduct proof-of-concept demonstrations in which a person performs a
collaborative task with a small industrial robot.
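The online inference of a partially observable human type can be pictured as a Bayesian belief update. The sketch below is a deliberately simplified illustration, not the paper's Mixed Observability Markov Decision Process formulation: it assumes a fixed table P(action | type) that ignores the task state and the robot's own action, and the type and action names in the usage note are hypothetical.

```python
def update_belief(belief, likelihoods, action):
    """One Bayes step: posterior over human types after observing `action`.

    belief:      dict mapping type -> prior probability
    likelihoods: dict mapping type -> dict mapping action -> P(action | type)
    """
    unnormalized = {
        t: belief[t] * likelihoods[t].get(action, 0.0) for t in belief
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every type; keep the prior.
        return dict(belief)
    return {t: p / total for t, p in unnormalized.items()}


def infer_type(prior, likelihoods, actions):
    """Fold a sequence of observed actions into the belief, one at a time."""
    belief = dict(prior)
    for action in actions:
        belief = update_belief(belief, likelihoods, action)
    return belief
```

For example, with two hypothetical types "cautious" and "efficient", a uniform prior, and P("wait" | cautious) = 0.8 versus P("wait" | efficient) = 0.2, observing a single "wait" moves the belief in "cautious" from 0.5 to 0.8. A policy computed over such beliefs can then adapt as more actions are observed.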
Discovery of molecular subtypes in leiomyosarcoma through integrative molecular profiling.
Leiomyosarcoma (LMS) is a soft tissue tumor with a significant degree of morphologic and molecular heterogeneity. We used integrative molecular profiling to discover and characterize molecular subtypes of LMS. Gene expression profiling was performed on 51 LMS samples. Unsupervised clustering showed three reproducible LMS clusters. Array comparative genomic hybridization (aCGH) was performed on 20 LMS samples and showed that the molecular subtypes defined by gene expression showed distinct genomic changes. Tumors from the muscle-enriched cluster showed significantly increased copy number changes (P=0.04). A majority of the muscle-enriched cases showed loss at 16q24, which contains Fanconi anemia, complementation group A, known to have an important role in DNA repair, and loss at 1p36, which contains PRDM16, loss of which promotes muscle differentiation. Immunohistochemistry (IHC) was performed on LMS tissue microarrays (n=377) for five markers with high levels of messenger RNA in the muscle-enriched cluster (ACTG2, CASQ2, SLMAP, CFL2 and MYLK) and showed significantly correlated expression of the five proteins (all pairwise P<0.005). Expression of the five markers was associated with improved disease-specific survival in a multivariate Cox regression analysis (P<0.04). In this analysis that combined gene expression profiling, aCGH and IHC, we characterized distinct molecular LMS subtypes, provided insight into their pathogenesis, and identified prognostic biomarkers.
A typical reconstruction limit of compressed sensing based on Lp-norm minimization
We consider the problem of reconstructing an $N$-dimensional continuous
vector $\mathbf{x}$ from $P$ constraints which are generated by its linear
transformation under the assumption that the number of non-zero elements of
$\mathbf{x}$ is typically limited to $\rho N$ ($0 \le \rho \le 1$). Problems of this
type can be solved by minimizing a cost function with respect to the $L_p$-norm
$\|\mathbf{x}\|_p=\lim_{\epsilon \to +0}\sum_{i=1}^N |x_i|^{p+\epsilon}$, subject to
the constraints under an appropriate condition. For several $p$, we assess a
typical case limit $\alpha_c(\rho)$, which represents a critical relation
between $\alpha = P/N$ and $\rho$ for successfully reconstructing the original
vector by minimization for typical situations in the limit $N, P \to \infty$
while keeping $\alpha$ finite, utilizing the replica method. For $p=1$,
$\alpha_c(\rho)$ is considerably smaller than its worst case counterpart, which
has been rigorously derived in the existing information theory literature.
Comment: 12 pages, 2 figures
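As a worked reading of the norm definition in this abstract, the limit in epsilon only matters at p = 0, where it turns the sum into a count of non-zero entries. Writing the vector as \mathbf{x}, two special cases follow directly from the stated formula:

```latex
\|\mathbf{x}\|_0
  = \lim_{\epsilon \to +0} \sum_{i=1}^{N} |x_i|^{\epsilon}
  = \#\{\, i : x_i \neq 0 \,\},
\qquad
\|\mathbf{x}\|_1
  = \sum_{i=1}^{N} |x_i| .
```

Each term |x_i|^epsilon tends to 1 when x_i is non-zero and equals 0 when x_i = 0, which gives the count. Note that this definition omits the usual 1/p root, so for p = 2 it equals the squared Euclidean norm.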
Differentially Private Model Selection with Penalized and Constrained Likelihood
In statistical disclosure control, the goal of data analysis is twofold: The
released information must provide accurate and useful statistics about the
underlying population of interest, while minimizing the potential for an
individual record to be identified. In recent years, the notion of differential
privacy has received much attention in theoretical computer science, machine
learning, and statistics. It provides a rigorous and strong notion of
protection for individuals' sensitive information. A fundamental question is
how to incorporate differential privacy into traditional statistical inference
procedures. In this paper we study model selection in multivariate linear
regression under the constraint of differential privacy. We show that model
selection procedures based on penalized least squares or likelihood can be made
differentially private by a combination of regularization and randomization,
and propose two algorithms to do so. We show that our private procedures are
consistent under essentially the same conditions as the corresponding
non-private procedures. We also find that under differential privacy, the
procedure becomes more sensitive to the tuning parameters. We illustrate and
evaluate our method using simulation studies and two real data examples.
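To give a feel for the randomization ingredient, here is a minimal sketch of the standard Laplace mechanism applied to a clipped mean. It is not one of the two model selection algorithms proposed in the paper; the function name `private_mean`, the clipping bounds, and the replace-one-record notion of neighboring datasets are assumptions made for illustration.

```python
import random


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of values clipped to [lower, upper] with Laplace noise.

    With n fixed, replacing one record changes the clipped mean by at most
    (upper - lower) / n, so Laplace noise of scale sensitivity / epsilon
    yields epsilon-differential privacy for this single release.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / len(clipped) + noise


rng = random.Random(0)  # seeded only to make the illustration reproducible
released = private_mean([0.0, 1.0, 2.0, 10.0], 0.0, 4.0, epsilon=0.5, rng=rng)
```

A smaller epsilon means a larger noise scale, so the released value becomes less accurate as the privacy guarantee gets stronger. This trade-off is one intuitive reason a private procedure can be more sensitive to its tuning parameters.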