Generalization properties of finite size polynomial Support Vector Machines
The learning properties of finite size polynomial Support Vector Machines are
analyzed in the case of realizable classification tasks. The normalization of
the high order features acts as a squeezing factor, introducing a strong
anisotropy in the pattern distribution in feature space. As a function of the
training set size, the corresponding generalization error presents a crossover,
more or less abrupt depending on the distribution's anisotropy and on the task
to be learned, between a fast-decreasing and a slowly decreasing regime. This
behaviour corresponds to the stepwise decrease found by Dietrich et al. [Phys.
Rev. Lett. 82 (1999) 2975-2978] in the thermodynamic limit. The theoretical
results are in excellent agreement with the numerical simulations.
Comment: 12 pages, 7 figures
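As a minimal numerical sketch of the learning-curve crossover described above, one can replace the SVM by a simple perceptron trained in a normalized quadratic feature space (an assumption for illustration, not the paper's method); the feature map, the normalization factor, and all sizes below are chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)

def quad_features(X):
    """Map inputs to linear plus degree-2 monomial features.
    The 1/sqrt(d) normalization of the quadratic block mimics the
    'squeezing' of high-order features mentioned in the abstract."""
    n, d = X.shape
    quad = np.einsum('ni,nj->nij', X, X).reshape(n, d * d) / np.sqrt(d)
    return np.hstack([X, quad])

d = 5
w_teacher = rng.standard_normal(d + d * d)     # teacher vector in feature space

def gen_error(p, n_test=2000):
    """Train a perceptron on p teacher-labeled examples; estimate test error."""
    Xtr = rng.standard_normal((p, d))
    Ftr = quad_features(Xtr)
    ytr = np.sign(Ftr @ w_teacher)
    w = np.zeros_like(w_teacher)
    for _ in range(50):                         # plain perceptron epochs
        for f, y in zip(Ftr, ytr):
            if y * (f @ w) <= 0:
                w += y * f
    Xte = rng.standard_normal((n_test, d))
    yte = np.sign(quad_features(Xte) @ w_teacher)
    return np.mean(np.sign(quad_features(Xte) @ w) != yte)

for p in (10, 50, 250):
    print(p, round(gen_error(p), 3))            # error shrinks with p
```

Plotting this error against the training set size for different feature normalizations would expose the fast/slow crossover the abstract analyzes.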
Retarded Learning: Rigorous Results from Statistical Mechanics
We study learning of probability distributions characterized by an unknown
symmetry direction. Based on an entropic performance measure and the
variational method of statistical mechanics we develop exact upper and lower
bounds on the scaled critical number of examples below which learning of the
direction is impossible. The asymptotic tightness of the bounds suggests an
asymptotically optimal method for learning nonsmooth distributions.
Comment: 8 pages, 1 figure
Statistical mechanics of random two-player games
Using methods from the statistical mechanics of disordered systems we analyze
the properties of bimatrix games with random payoffs in the limit where the
number of pure strategies of each player tends to infinity. We analytically
calculate quantities such as the number of equilibrium points, the expected
payoff, and the fraction of strategies played with non-zero probability as a
function of the correlation between the payoff matrices of both players and
compare the results with numerical simulations.
Comment: 16 pages, 6 figures, for further information see http://itp.nat.uni-magdeburg.de/~jberg/games.htm
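A quick sanity check related to the quantities above (pure-strategy equilibria only, not the paper's replica calculation): for i.i.d. continuous payoffs the expected number of pure Nash equilibria is exactly 1, independent of the number of strategies. A sketch with assumed sizes:

```python
import numpy as np

rng = np.random.default_rng(1)

def count_pure_nash(A, B):
    """Count pure Nash equilibria of the bimatrix game (A, B):
    cell (i, j) is one iff A[i, j] is maximal in column j (row player's
    best response) and B[i, j] is maximal in row i (column player's)."""
    best_row = A == A.max(axis=0, keepdims=True)
    best_col = B == B.max(axis=1, keepdims=True)
    return int(np.sum(best_row & best_col))

N, trials = 20, 2000
counts = [count_pure_nash(rng.standard_normal((N, N)),
                          rng.standard_normal((N, N)))
          for _ in range(trials)]
print(np.mean(counts))   # close to 1 for independent payoffs
```

Introducing correlation between A and B shifts this average, which is the regime the abstract studies in the large-N limit.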
Storage of correlated patterns in a perceptron
We calculate the storage capacity of a perceptron for correlated Gaussian
patterns. We find that the storage capacity α_c can be less than 2 if
similar patterns are mapped onto different outputs and vice versa. As long as
the patterns are in general position we obtain, in contrast to previous works,
α_c = 2, in agreement with Cover's theorem. Numerical simulations
confirm the results.
Comment: 9 pages LaTeX ioplppt style, figures included using eps
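The capacity α_c = 2 of Cover's theorem can be probed numerically with a crude separability test (a sketch with arbitrary sizes, using the perceptron rule as a heuristic rather than an exact check):

```python
import numpy as np

rng = np.random.default_rng(2)

def separable(X, y, epochs=500):
    """Heuristic linear-separability check: run the perceptron rule and
    report whether an error-free pass over the data is reached."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errs = 0
        for x, t in zip(X, y):
            if t * (x @ w) <= 0:
                w += t * x
                errs += 1
        if errs == 0:
            return True
    return False

N = 40
for alpha in (0.5, 4.0):      # capacity for random labels is alpha_c = 2
    p = int(alpha * N)
    X = rng.standard_normal((p, N))
    y = rng.choice([-1.0, 1.0], size=p)
    print(alpha, separable(X, y))   # True below capacity, False above
```

Replacing the i.i.d. patterns by correlated Gaussian ones, with labels tied to pattern similarity, is the setting in which the abstract finds α_c below 2.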
An information theoretic approach to statistical dependence: copula information
We discuss the connection between information and copula theories by showing
that a copula can be employed to decompose the information content of a
multivariate distribution into marginal and dependence components, with the
latter quantified by the mutual information. We define the information excess
as a measure of deviation from a maximum entropy distribution. The idea of
marginal invariant dependence measures is also discussed and used to show that
empirical linear correlation underestimates the amplitude of the actual
correlation in the case of non-Gaussian marginals. The mutual information is
shown to provide an upper bound for the asymptotic empirical log-likelihood of
a copula. An analytical expression for the information excess of T-copulas is
provided, allowing for simple model identification within this family. We
illustrate the framework in a financial data set.
Comment: to appear in Europhysics Letters
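The underestimation effect described above can be illustrated in the simplest case, a Gaussian copula, whose mutual information is MI = -½ ln(1-ρ²) (a standard closed form, not specific to this paper); a monotone change of marginals leaves the copula, and hence MI, untouched while shrinking the empirical linear correlation:

```python
import numpy as np

rng = np.random.default_rng(3)

rho, n = 0.8, 100_000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
x, y = z[:, 0], z[:, 1]

# Mutual information of a Gaussian copula with parameter rho
# (the marginal-invariant dependence content):
mi_gauss = -0.5 * np.log(1 - rho**2)

# A monotone, non-linear marginal transform preserves the copula but
# reduces the empirical Pearson correlation -- the effect in the abstract.
u, v = np.exp(x), np.exp(y)                # log-normal marginals
r_before = np.corrcoef(x, y)[0, 1]
r_after = np.corrcoef(u, v)[0, 1]
print(round(mi_gauss, 3), round(r_before, 3), round(r_after, 3))
```

A rank-based (copula-level) dependence estimate would recover the full ρ = 0.8 regardless of the marginals.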
Perceptron capacity revisited: classification ability for correlated patterns
In this paper, we address the problem of how many randomly labeled patterns
can be correctly classified by a single-layer perceptron when the patterns are
correlated with each other. In order to solve this problem, two analytical
schemes are developed based on the replica method and Thouless-Anderson-Palmer
(TAP) approach by utilizing an integral formula concerning random rectangular
matrices. The validity and relevance of the developed methodologies are shown
for one known result and two example problems. A message-passing algorithm to
perform the TAP scheme is also presented.
Multifractality and percolation in the coupling space of perceptrons
The coupling space of perceptrons with continuous as well as with binary
weights gets partitioned into a disordered multifractal by a set of random
input patterns. The multifractal spectrum can be calculated analytically using
the replica formalism. The storage capacity and the generalization behaviour
of the perceptron are shown to be related to properties of this spectrum,
which are correctly described within the replica symmetric ansatz. Replica
symmetry breaking is interpreted geometrically as a transition from
percolating to non-percolating cells. The existence of empty cells gives rise
to singularities in the multifractal spectrum. The analytical results for
binary couplings are corroborated by numerical studies.
Comment: 13 pages, revtex, 4 eps figures, version accepted for publication in Phys. Rev.
Entropy and typical properties of Nash equilibria in two-player games
We use techniques from the statistical mechanics of disordered systems to
analyse the properties of Nash equilibria of bimatrix games with large random
payoff matrices. By means of an annealed bound, we calculate their number and
analyse the properties of typical Nash equilibria, which are exponentially
dominant in number. We find that a randomly chosen equilibrium realizes almost
always equal payoffs to either player. This value and the fraction of
strategies played at an equilibrium point are calculated as a function of the
correlation between the two payoff matrices. The picture is complemented by the
calculation of the properties of Nash equilibria in pure strategies.
Comment: 6 pages, was "Self averaging of Nash equilibria in two player games", main section rewritten, some new results, for additional information see http://itp.nat.uni-magdeburg.de/~jberg/games.htm
Implicitly Constrained Semi-Supervised Least Squares Classification
We introduce a novel semi-supervised version of the least squares classifier.
This implicitly constrained least squares (ICLS) classifier minimizes the
squared loss on the labeled data among the set of parameters implied by all
possible labelings of the unlabeled data. Unlike other discriminative
semi-supervised methods, our approach does not introduce explicit additional
assumptions into the objective function, but leverages implicit assumptions
already present in the choice of the supervised least squares classifier. We
show this approach can be formulated as a quadratic programming problem and its
solution can be found using a simple gradient descent procedure. We prove that,
in a certain way, our method never leads to performance worse than the
supervised classifier. Experimental results corroborate this theoretical result
in the multidimensional case on benchmark datasets, also in terms of the error
rate.
Comment: 12 pages, 2 figures, 1 table. The Fourteenth International Symposium on Intelligent Data Analysis (2015), Saint-Etienne, France
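The constrained minimization described above can be made concrete by brute force when the unlabeled set is tiny (a toy sketch of the idea, not the paper's quadratic-programming formulation; all data and sizes are invented):

```python
from itertools import product

import numpy as np

rng = np.random.default_rng(4)

def lstsq_fit(X, y):
    """Ordinary least squares with an intercept column."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict(w, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def icls_brute(Xl, yl, Xu):
    """Brute-force ICLS sketch: enumerate all +/-1 labelings of the
    unlabeled points, fit least squares on the combined data, and keep
    the parameters with the smallest squared loss on the *labeled* data."""
    best_w, best_loss = None, np.inf
    for labels in product([-1.0, 1.0], repeat=len(Xu)):
        w = lstsq_fit(np.vstack([Xl, Xu]), np.concatenate([yl, labels]))
        loss = np.mean((predict(w, Xl) - yl) ** 2)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w

# Toy data: labels from the sign of the first coordinate.
Xl = rng.standard_normal((10, 2)); yl = np.sign(Xl[:, 0])
Xu = rng.standard_normal((8, 2))
w = icls_brute(Xl, yl, Xu)
print(np.mean(np.sign(predict(w, Xl)) != yl))
```

By construction the returned parameters are constrained to be implied by some labeling of the unlabeled data, so their labeled-data loss can never beat the unconstrained supervised fit; the point of the paper is that this constraint nevertheless never hurts generalization in a well-defined sense.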
Generalizing with perceptrons in case of structured phase- and pattern-spaces
We investigate the influence of different kinds of structure on the learning
behaviour of a perceptron performing a classification task defined by a teacher
rule. The underlying pattern distribution is permitted to have spatial
correlations. The prior distribution for the teacher coupling vectors itself is
assumed to be nonuniform. Thus classification tasks of quite different
difficulty are included. As learning algorithms we discuss Hebbian learning,
Gibbs learning, and Bayesian learning with different priors, using methods from
statistics and the replica formalism. We find that the Hebb rule is quite
sensitive to the structure of the actual learning problem, failing
asymptotically in most cases. In contrast, the behaviour of the more
sophisticated methods of Gibbs and Bayes learning is influenced by the spatial
correlations only in an intermediate regime of α, where α specifies the size
of the training set. Concerning the Bayesian case we show how enhanced prior
knowledge improves the performance.
Comment: LaTeX, 32 pages with eps-figs, accepted by J Phys
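The Hebb rule's learning curve in the plain teacher-student setting (isotropic patterns, uniform teacher prior, i.e. the baseline the abstract perturbs) is easy to simulate; all sizes here are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(5)

N = 200
w_teacher = rng.standard_normal(N)
w_teacher /= np.linalg.norm(w_teacher)

def hebb_error(alpha, n_test=5000):
    """Train with the Hebb rule on p = alpha*N teacher-labeled patterns
    and estimate the generalization error on fresh patterns."""
    p = int(alpha * N)
    X = rng.standard_normal((p, N))
    y = np.sign(X @ w_teacher)
    w = (y[:, None] * X).sum(axis=0)      # Hebbian coupling vector
    Xt = rng.standard_normal((n_test, N))
    return np.mean(np.sign(Xt @ w) != np.sign(Xt @ w_teacher))

for alpha in (0.5, 2.0, 8.0):
    print(alpha, round(hebb_error(alpha), 3))   # decreases with alpha
```

Replacing the isotropic pattern draw by spatially correlated patterns, and the uniform teacher by a structured prior, reproduces the regimes in which the abstract finds Hebb learning failing while Gibbs and Bayes learning remain robust.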