Efficient Data Representation by Selecting Prototypes with Importance Weights
Prototypical examples that best summarize and compactly represent an
underlying complex data distribution communicate meaningful insights to humans
in domains where simple explanations are hard to extract. In this paper we
present algorithms with strong theoretical guarantees to mine these data sets
and select prototypes a.k.a. representatives that optimally describe them. Our
work notably generalizes the recent work by Kim et al. (2016) where in addition
to selecting prototypes, we also associate non-negative weights which are
indicative of their importance. This extension provides a single coherent
framework under which both prototypes and criticisms (i.e. outliers) can be
found. Furthermore, our framework works for any symmetric positive definite
kernel thus addressing one of the key open questions laid out in Kim et al.
(2016). By establishing that our objective function enjoys the key property of
weak submodularity, we present a fast ProtoDash algorithm and also
derive approximation guarantees for the same. We demonstrate the efficacy of
our method on diverse domains such as retail, digit recognition (MNIST) and on
40 publicly available health questionnaires obtained from the Center for
Disease Control (CDC) website maintained by the US Dept. of Health. We validate
the results quantitatively as well as qualitatively based on expert feedback
and recently published scientific studies on public health, thus showcasing the
power of our technique in providing actionability (for retail), utility (for
MNIST) and insight (on CDC datasets) which arguably are the hallmarks of an
effective data mining method.
Comment: Accepted for publication in International Conference on Data Mining
(ICDM) 201
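The idea of selecting prototypes together with non-negative importance weights can be illustrated with a small greedy sketch. This is not the authors' ProtoDash implementation: the abstract does not state the objective, so the code assumes a kernel mean-matching objective of the form mu^T w - 0.5 w^T K w with w >= 0, a Gaussian kernel, and a simple coordinate-ascent weight solver. The function names (`gaussian_kernel`, `fit_weights`, `select_prototypes`) are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel, one example of a symmetric positive definite kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_weights(K, mu, selected, sweeps=200):
    """Maximize mu^T w - 0.5 w^T K w over w >= 0, restricted to `selected`.

    Uses cyclic coordinate ascent: each coordinate update is the exact
    one-dimensional maximizer, clipped at zero. (Assumed objective.)
    """
    w = {j: 0.0 for j in selected}
    for _ in range(sweeps):
        for j in selected:
            rest = sum(K[j][k] * w[k] for k in selected if k != j)
            w[j] = max(0.0, (mu[j] - rest) / K[j][j])
    return w

def select_prototypes(data, m, sigma=1.0):
    """Greedily pick up to m prototypes from `data` and weight them."""
    n = len(data)
    K = [[gaussian_kernel(data[i], data[j], sigma) for j in range(n)]
         for i in range(n)]
    # mu[j]: mean kernel similarity between candidate j and the whole data set
    mu = [sum(row) / n for row in K]
    selected, w = [], {}
    for _ in range(m):
        best, best_gain = None, 0.0
        for j in range(n):
            if j in w:
                continue
            # Gradient of the assumed objective with respect to w_j at the current w
            grad = mu[j] - sum(K[j][k] * w[k] for k in selected)
            if grad > best_gain:
                best, best_gain = j, grad
        if best is None:  # no candidate can still improve the objective
            break
        selected.append(best)
        w = fit_weights(K, mu, selected)  # re-fit all weights jointly
    return [(j, w[j]) for j in selected]
```

Calling `select_prototypes(points, 2)` on a list of coordinate tuples returns `(index, weight)` pairs. Under these assumptions, a prototype standing in for a denser region of the data tends to receive a larger weight.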
Fractal basins of convergence of libration points in the planar Copenhagen problem with a repulsive quasi-homogeneous Manev-type potential
The Newton-Raphson basins of convergence, corresponding to the coplanar
libration points (which act as attractors), are unveiled in the Copenhagen
problem, where instead of the Newtonian potential and forces, a
quasi-homogeneous potential created by two primaries is considered. The
multivariate version of the Newton-Raphson iterative scheme is used to reveal
the attracting domain associated with the libration points on various types of
two-dimensional configuration planes. The correlations between the basins of
convergence and the corresponding required number of iterations are also
presented and discussed in detail. The present numerical analysis reveals that
the evolution of the attracting domains in this dynamical system is very
complicated; however, it is an issue worth studying.
Comment: Published in International Journal of Non-Linear Mechanics (IJNLM
- …
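How a multivariate Newton-Raphson scheme assigns an initial condition to an attractor, and how many iterations that takes, can be sketched as follows. The equations of the Copenhagen problem with the quasi-homogeneous potential are not given in the abstract, so this sketch uses a stand-in two-variable system (the real and imaginary parts of z^3 - 1 = 0); the names `residual`, `jacobian`, `newton`, and `basin` are hypothetical.

```python
import math

# Stand-in system (NOT the Copenhagen-problem equations): real and imaginary
# parts of z**3 - 1 = 0 with z = x + i*y. Its three roots play the role of
# the attractors.
ROOTS = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2), (-0.5, -math.sqrt(3) / 2)]

def residual(x, y):
    """The two equations f1(x, y) = 0 and f2(x, y) = 0."""
    return x**3 - 3*x*y**2 - 1.0, 3*x**2*y - y**3

def jacobian(x, y):
    """Entries (a, b, c, d) of the 2x2 Jacobian [[a, b], [c, d]]."""
    a = 3*x**2 - 3*y**2
    b = -6*x*y
    return a, b, -b, a

def newton(x, y, tol=1e-12, max_iter=100):
    """Iterate (x, y) <- (x, y) - J^{-1} F until the step is below tol.

    Returns (x, y, iterations), or None if the Jacobian is singular or
    the iteration limit is reached.
    """
    for n in range(1, max_iter + 1):
        f1, f2 = residual(x, y)
        a, b, c, d = jacobian(x, y)
        det = a*d - b*c
        if det == 0.0:
            return None
        # Solve J * (dx, dy) = -(f1, f2) by Cramer's rule
        dx = (-f1*d + b*f2) / det
        dy = (-a*f2 + c*f1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            return x, y, n
    return None

def basin(x0, y0):
    """Index of the root reached from (x0, y0) and the iteration count."""
    result = newton(x0, y0)
    if result is None:
        return None
    x, y, n = result
    dists = [math.hypot(x - rx, y - ry) for rx, ry in ROOTS]
    return dists.index(min(dists)), n
```

Evaluating `basin` on a dense grid of initial conditions and coloring each node by the returned root index gives a basin diagram; coloring by the iteration count instead gives the corresponding map of required iterations.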
