The Lovász Hinge: A Novel Convex Surrogate for Submodular Losses
Learning with non-modular losses is an important problem when sets of
predictions are made simultaneously. The main tools for constructing convex
surrogate loss functions for set prediction are margin rescaling and slack
rescaling. In this work, we show that these strategies lead to tight convex
surrogates iff the underlying loss function is increasing in the number of
incorrect predictions. However, gradient or cutting-plane computation for these
functions is NP-hard for non-supermodular loss functions. We propose instead a
novel surrogate loss function for submodular losses, the Lovász hinge, which
leads to O(p log p) complexity with O(p) oracle accesses to the loss function
to compute a gradient or cutting-plane. We prove that the Lovász hinge is
convex and yields an extension. As a result, we have developed the first
tractable convex surrogates in the literature for submodular losses. We
demonstrate the utility of this novel convex surrogate through several set
prediction tasks, including on the PASCAL VOC and Microsoft COCO datasets.
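
The O(p log p) cost quoted above matches the standard way of evaluating a Lovász extension: sort the p coordinates, then query the set function once per element along the sorted order. The sketch below shows only that generic computation. The function names and the toy square-root loss are illustrative assumptions, and the hinge-specific treatment of margins described in the paper is not reproduced here.

```python
from typing import Callable, FrozenSet, List


def lovasz_extension(loss: Callable[[FrozenSet[int]], float], s: List[float]) -> float:
    """Evaluate the Lovász extension of a set function at the point s.

    `loss` maps a set of indices to a real number, with loss(empty set) == 0.
    """
    # Sort indices by decreasing value of s: this is the O(p log p) step.
    order = sorted(range(len(s)), key=lambda i: s[i], reverse=True)

    value = 0.0
    prefix: set = set()
    previous = loss(frozenset())
    for i in order:
        prefix.add(i)
        current = loss(frozenset(prefix))     # one oracle call per element
        value += s[i] * (current - previous)  # s[i] weighted by the marginal gain
        previous = current
    return value


def sqrt_loss(mistakes: FrozenSet[int]) -> float:
    """Toy submodular loss (hypothetical): concave in the number of mistakes."""
    return len(mistakes) ** 0.5


# On an indicator vector the extension agrees with the set function itself.
print(lovasz_extension(sqrt_loss, [1.0, 0.0, 1.0]))
```

The marginal gains collected in the loop also form a (sub)gradient with respect to s, which is why a single sort plus p oracle calls is enough for a gradient or cutting-plane step.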
The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
The Jaccard index, also referred to as the intersection-over-union score, is
commonly employed in the evaluation of image segmentation results given its
perceptual qualities, scale invariance (which lends appropriate relevance to
small objects), and appropriate counting of false negatives, in comparison to
per-pixel losses. We present a method for direct optimization of the mean
intersection-over-union loss in neural networks, in the context of semantic
image segmentation, based on the convex Lovász extension of submodular
losses. The loss is shown to perform better with respect to the Jaccard index
measure than the traditionally used cross-entropy loss. We show quantitative
and qualitative differences between optimizing the Jaccard index per image
versus optimizing the Jaccard index taken over an entire dataset. We evaluate
the impact of our method in a semantic segmentation pipeline and show
substantially improved intersection-over-union segmentation scores on the
Pascal VOC and Cityscapes datasets using state-of-the-art deep learning
segmentation architectures. Comment: Accepted as a conference paper at CVPR 201
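
The distinction between a per-image Jaccard index and one taken over an entire dataset can be illustrated with a small sketch. It only computes the two evaluation scores on binary masks. The function names, the flattened-mask representation, and the convention of scoring an empty union as 1.0 are assumptions made for illustration, and the surrogate loss itself is not shown.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[int]  # flattened binary mask, 1 = foreground


def counts(pred: Mask, truth: Mask) -> Tuple[int, int]:
    """Return (intersection, union) pixel counts for one image."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter, union


def jaccard_per_image(preds: List[Mask], truths: List[Mask]) -> float:
    """Mean of the Jaccard indices computed separately for each image."""
    scores = []
    for pred, truth in zip(preds, truths):
        inter, union = counts(pred, truth)
        # Assumed convention: an empty union counts as a perfect score.
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)


def jaccard_dataset(preds: List[Mask], truths: List[Mask]) -> float:
    """Jaccard index with pixel counts pooled over all images."""
    total_inter = total_union = 0
    for pred, truth in zip(preds, truths):
        inter, union = counts(pred, truth)
        total_inter += inter
        total_union += union
    return total_inter / total_union if total_union else 1.0


# One missed small object and one perfectly segmented large object.
truths = [[1, 0, 0, 0], [1, 1, 1, 1]]
preds = [[0, 0, 0, 0], [1, 1, 1, 1]]
print(jaccard_per_image(preds, truths))  # 0.5
print(jaccard_dataset(preds, truths))    # 0.8
```

In this toy case the missed small object halves the per-image score but costs only one pixel in the pooled score, which is one way the two criteria can diverge.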
Learning to Discover Sparse Graphical Models
We consider structure discovery of undirected graphical models from
observational data. Inferring likely structures from few examples is a complex
task often requiring the formulation of priors and sophisticated inference
procedures. Popular methods rely on estimating a penalized maximum likelihood
of the precision matrix. However, in these approaches structure recovery is an
indirect consequence of the data-fit term, the penalty can be difficult to
adapt for domain-specific knowledge, and the inference is computationally
demanding. By contrast, it may be easier to generate training samples of data
that arise from graphs with the desired structure properties. We propose here
to leverage this latter source of information as training data to learn a
function, parametrized by a neural network, that maps empirical covariance
matrices to estimated graph structures. Learning this function brings two
benefits: it implicitly models the desired structure or sparsity properties to
form suitable priors, and it can be tailored to the specific problem of edge
structure discovery, rather than maximizing data likelihood. Applying this
framework, we find our learnable graph-discovery method trained on synthetic
data generalizes well, identifying relevant edges in both synthetic and real
data, completely unknown at training time. We find that on genetics, brain
imaging, and simulation data we obtain performance generally superior to
analytical methods.
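
The idea of generating training samples from graphs with a desired structure can be sketched as follows: draw a random sparse graph, build a Gaussian precision matrix with that support, sample data, and pair the empirical covariance with the true edge set. This is a hypothetical generator written for illustration, not the authors' procedure. The function name, the edge probability, and the diagonal-dominance trick for positive definiteness are all assumptions.

```python
import numpy as np


def sample_training_pair(p: int = 5, n: int = 50, edge_prob: float = 0.3, seed: int = 0):
    """Return (empirical covariance, adjacency) for one synthetic sparse graph."""
    rng = np.random.default_rng(seed)

    # Random symmetric adjacency matrix without self-loops.
    upper = np.triu(rng.random((p, p)) < edge_prob, k=1)
    adjacency = upper | upper.T

    # Symmetric random weights, kept only where the graph has an edge.
    weights = np.triu(rng.uniform(-1.0, 1.0, size=(p, p)), k=1)
    weights = weights + weights.T
    precision = np.where(adjacency, weights, 0.0)

    # Strict diagonal dominance makes the precision matrix positive definite.
    precision[np.diag_indices(p)] = np.abs(precision).sum(axis=1) + 1.0

    covariance = np.linalg.inv(precision)
    covariance = (covariance + covariance.T) / 2.0  # remove round-off asymmetry

    samples = rng.multivariate_normal(np.zeros(p), covariance, size=n)
    empirical_cov = np.cov(samples, rowvar=False)
    return empirical_cov, adjacency


empirical_cov, adjacency = sample_training_pair()
print(empirical_cov.shape, int(adjacency.sum()) // 2, "edges")
```

Many such pairs, drawn with different seeds, would form a supervised training set in which the covariance is the input and the adjacency matrix is the target.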
A low variance consistent test of relative dependency
We describe a novel non-parametric statistical hypothesis test of relative
dependence between a source variable and two candidate target variables. Such a
test enables us to determine whether one source variable is significantly more
dependent on a first target variable or a second. Dependence is measured via
the Hilbert-Schmidt Independence Criterion (HSIC), resulting in a pair of
empirical dependence measures (source-target 1, source-target 2). We test
whether the first dependence measure is significantly larger than the second.
Modeling the covariance between these HSIC statistics leads to a provably more
powerful test than the construction of independent HSIC statistics by
sub-sampling. The resulting test is consistent and unbiased, and (being based
on U-statistics) has favorable convergence properties. The test can be computed
in quadratic time, matching the computational complexity of standard empirical
HSIC estimators. The effectiveness of the test is demonstrated on several
real-world problems: we identify language groups from a multilingual corpus,
and we prove that tumor location is more dependent on gene expression than
chromosomal imbalances. Source code is available for download at
https://github.com/wbounliphone/reldep. Comment: International Conference on Machine Learning, Jul 2015, Lille, Franc
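
As a rough illustration of the quantities being compared, the sketch below computes the standard biased empirical HSIC with Gaussian kernels in quadratic time and evaluates it for two candidate targets. It is not the test from the paper: the covariance modeling between the two statistics and the resulting significance threshold are omitted. The fixed bandwidth and the 1/n^2 normalization are assumptions (other conventions exist), and the linked repository should be consulted for the authors' implementation.

```python
import numpy as np


def gaussian_kernel(x: np.ndarray, bandwidth: float) -> np.ndarray:
    """Gaussian kernel matrix for the rows of x (shape n x d)."""
    sq_norms = (x ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def hsic_biased(x: np.ndarray, y: np.ndarray, bandwidth: float = 1.0) -> float:
    """Biased empirical HSIC, trace(K H L H) / n^2, computed in O(n^2)."""
    n = x.shape[0]
    k = gaussian_kernel(x, bandwidth)
    l = gaussian_kernel(y, bandwidth)
    # Double-centering K equals H K H and avoids cubic-cost matrix products.
    k_centered = (
        k
        - k.mean(axis=0, keepdims=True)
        - k.mean(axis=1, keepdims=True)
        + k.mean()
    )
    # trace((H K H) L) is an elementwise sum because both matrices are symmetric.
    return float((k_centered * l).sum()) / n ** 2


rng = np.random.default_rng(0)
source = rng.normal(size=(200, 1))
target_1 = source + 0.1 * rng.normal(size=(200, 1))  # strongly dependent on source
target_2 = rng.normal(size=(200, 1))                 # independent of source

# Raw difference of the two dependence measures (no significance test applied).
print(hsic_biased(source, target_1) - hsic_biased(source, target_2))
```

A positive difference suggests the source depends more on the first target, but deciding whether it is significant requires the variance estimate that the test itself provides.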
