Variable selection for model-based clustering using the integrated complete-data likelihood
Variable selection in cluster analysis is important yet challenging. It can
be achieved by regularization methods, which realize a trade-off between
clustering accuracy and the number of selected variables through a lasso-type
penalty. However, the calibration of the penalty term is open to criticism.
Model selection methods are an efficient alternative, yet they require a
difficult optimization of an information criterion that involves combinatorial
problems. First, most of these optimization algorithms rely on a suboptimal
procedure (e.g. a stepwise method). Second, the algorithms are computationally
costly because they require multiple calls of EM algorithms. Here we propose
to use a new information criterion based on the integrated complete-data
likelihood. It does not require any estimate and its maximization is simple and
computationally efficient. The original contribution of our approach is to
perform the model selection without requiring any parameter estimation;
parameter inference is then needed only for the single selected model. This
approach is applied to variable selection for a Gaussian mixture model under a
conditional independence assumption. Numerical experiments on simulated and
benchmark datasets show that the proposed method often outperforms two
classical approaches to variable selection.
Comment: submitted to Statistics and Computing
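To illustrate the key mechanism behind a criterion of this kind — closed-form integrated complete-data likelihoods that let variables be screened for a fixed partition without estimating any mixture parameter — here is a minimal Python sketch. It assumes Normal-inverse-Gamma conjugate priors on each Gaussian variable and compares the per-cluster marginal likelihood against the pooled one; the function names log_marginal_gaussian and select_variables are illustrative, and the sketch omits the paper's actual maximization over partitions and variable subsets.

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_gaussian(x, mu0=0.0, kappa0=1.0, a0=1.0, b0=1.0):
    """Closed-form log marginal likelihood of a univariate Gaussian sample
    under a Normal-inverse-Gamma prior (standard conjugate result)."""
    n = x.size
    if n == 0:
        return 0.0
    xbar = x.mean()
    kappan = kappa0 + n
    an = a0 + n / 2.0
    bn = (b0 + 0.5 * np.sum((x - xbar) ** 2)
          + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappan))
    return (-0.5 * n * np.log(2.0 * np.pi) + 0.5 * np.log(kappa0 / kappan)
            + gammaln(an) - gammaln(a0) + a0 * np.log(b0) - an * np.log(bn))

def select_variables(X, z):
    """For a fixed partition z, keep variable j when its integrated likelihood
    split by cluster beats the pooled (no clustering) one; conjugacy makes both
    terms closed-form, so no mixture parameter has to be estimated."""
    relevant = []
    for j in range(X.shape[1]):
        clustered = sum(log_marginal_gaussian(X[z == k, j]) for k in np.unique(z))
        pooled = log_marginal_gaussian(X[:, j])
        if clustered > pooled:
            relevant.append(j)
    return relevant

# Toy check: variable 0 separates two clusters, variable 1 is pure noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.r_[rng.normal(-3, 1, 100), rng.normal(3, 1, 100)],
                     rng.normal(0, 1, 200)])
z = np.r_[np.zeros(100, int), np.ones(100, int)]
print(select_variables(X, z))   # typically prints [0]
```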
Efficient learning in Approximate Bayesian Computation
Bayesian model selection in logistic regression for the detection of adverse drug reactions
Motivation: Spontaneous adverse event reports have a high potential for
detecting adverse drug reactions. However, due to their dimension, exploring
such databases requires statistical methods. In this context,
disproportionality measures are used, but by projecting the data onto
contingency tables they become sensitive to co-prescriptions and masking
effects. Recently, logistic regressions with a lasso-type penalty have been
used to detect associations between drugs and adverse events. However, the
choice of the penalty value is open to criticism, even though it strongly
influences the results. Results: In this paper, we propose to use a logistic
regression whose sparsity is viewed as a model selection challenge. Since the
model space is huge, a Metropolis-Hastings algorithm carries out the model
selection by maximizing the BIC criterion, thereby avoiding the calibration of
any penalty or threshold. In an application to the French pharmacovigilance
database, the proposed method is compared to well-established approaches on a
reference data set and obtains better rates of positive and negative controls.
However, many signals are not detected by the proposed method, so we conclude
that it should be used alongside existing pharmacovigilance measures.
Comment: 7 pages, 3 figures, submitted to Biometrical Journal
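As an illustration of model selection by Metropolis-Hastings over a large model space, the Python sketch below runs a random-walk sampler on variable-inclusion vectors for a logistic regression, targeting exp(-BIC/2) and remembering the best model visited (best in the usual convention where a smaller BIC is better). It is a generic sketch rather than the authors' algorithm: the single-coordinate flip proposal, the unpenalized maximum-likelihood fit via scipy.optimize, and the name mh_model_search are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(beta, X, y):
    """Negative log-likelihood of a logistic regression (numerically stable)."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta)) - y @ eta

def bic(X, y, cols):
    """Unpenalized fit on the selected columns (plus intercept), then BIC."""
    Xs = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    res = minimize(neg_loglik, np.zeros(Xs.shape[1]), args=(Xs, y), method="BFGS")
    return 2.0 * res.fun + Xs.shape[1] * np.log(len(y))

def mh_model_search(X, y, n_iter=2000, seed=0):
    """Random-walk Metropolis-Hastings on inclusion vectors, targeting
    exp(-BIC/2) and keeping track of the lowest-BIC model visited."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma = np.zeros(p, dtype=bool)
    current = bic(X, y, np.flatnonzero(gamma))
    best, best_gamma = current, gamma.copy()
    for _ in range(n_iter):
        prop = gamma.copy()
        prop[rng.integers(p)] ^= True                        # flip one inclusion indicator
        cand = bic(X, y, np.flatnonzero(prop))
        if np.log(rng.random()) < (current - cand) / 2.0:    # MH acceptance step
            gamma, current = prop, cand
            if current < best:
                best, best_gamma = current, gamma.copy()
    return best_gamma, best

# Usage sketch: X is an (n, p) drug-exposure matrix, y an (n,) adverse-event indicator.
# selected, score = mh_model_search(X, y)
```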
Efficient learning in ABC algorithms
Approximate Bayesian Computation has been successfully used in population
genetics to bypass the calculation of the likelihood. These methods provide
accurate estimates of the posterior distribution by comparing the observed
dataset to a sample of datasets simulated from the model. Although
parallelization is easily achieved, computation times for ensuring a suitable
approximation quality of the posterior distribution remain high. To alleviate
this computational burden, we propose an adaptive, sequential algorithm that
runs faster than other ABC algorithms while maintaining the accuracy of the
approximation. This proposal relies on the sequential Monte Carlo sampler of
Del Moral et al. (2012) but is calibrated to reduce the number of simulations
from the model. The paper concludes with numerical experiments on a toy example
and on a population genetic study of Apis mellifera, where our algorithm is
shown to be faster than traditional ABC schemes.
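The sketch below illustrates the general flavour of sequential ABC on a toy problem (inferring the mean of a unit-variance Gaussian under a N(0, 10) prior), with a tolerance that adapts each round to a quantile of the previous round's distances. It follows a standard ABC population Monte Carlo scheme rather than the calibrated SMC sampler proposed in the paper; the summary statistic, prior, kernel scaling and the name abc_pmc_toy are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def abc_pmc_toy(y_obs, n_particles=500, n_rounds=5, quantile=0.5, seed=0):
    """Sequential (population Monte Carlo) ABC on a toy model: y_i ~ N(theta, 1)
    with a N(0, 10) prior on theta. The tolerance shrinks each round to a
    quantile of the previous round's accepted distances."""
    rng = np.random.default_rng(seed)
    s_obs = y_obs.mean()                                   # summary statistic
    prior = norm(0.0, np.sqrt(10.0))

    def distance(theta):                                   # |summary(sim) - summary(obs)|
        return abs(rng.normal(theta, 1.0, y_obs.size).mean() - s_obs)

    # Round 0: plain rejection sampling from the prior.
    theta = rng.normal(0.0, np.sqrt(10.0), n_particles)
    dist = np.array([distance(t) for t in theta])
    w = np.full(n_particles, 1.0 / n_particles)
    eps = np.quantile(dist, quantile)

    for _ in range(n_rounds):
        tau = np.sqrt(2.0 * np.cov(theta, aweights=w))     # perturbation kernel scale
        new_theta = np.empty(n_particles)
        new_dist = np.empty(n_particles)
        for i in range(n_particles):
            while True:                                    # propose until accepted at eps
                cand = rng.normal(rng.choice(theta, p=w), tau)
                d = distance(cand)
                if d <= eps:
                    new_theta[i], new_dist[i] = cand, d
                    break
        # Importance weights correct for the resample-and-perturb proposal.
        kern = norm.pdf(new_theta[:, None], loc=theta[None, :], scale=tau)
        new_w = prior.pdf(new_theta) / (kern * w[None, :]).sum(axis=1)
        theta, dist, w = new_theta, new_dist, new_w / new_w.sum()
        eps = np.quantile(dist, quantile)                  # shrink the tolerance
    return theta, w

# Usage: weighted posterior sample for the mean of 100 observations drawn at theta = 2.
# data = np.random.default_rng(1).normal(2.0, 1.0, 100)
# samples, weights = abc_pmc_toy(data)
```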
Uniformly Lipschitzian mappings in modular function spaces
Let ρ be a convex modular function satisfying a ∆2-type condition and Lρ the
corresponding modular space. Assume that C is a ρ-bounded and ρ-a.e. compact subset of Lρ and T : C → C is a k-uniformly Lipschitzian mapping. We prove that T has a fixed point if k < (Ñ(Lρ))^(−1/2), where Ñ(Lρ) is a geometrical coefficient of normal structure. We also show that Ñ(Lρ) < 1 in modular Orlicz spaces for uniformly convex Orlicz functions.
Dirección General de Investigación Científica y Técnica; Plan Andaluz de Investigación (Junta de Andalucía)
Asymptotically regular mappings in modular function spaces
Let ρ be a modular function satisfying a ∆2-type condition and Lρ the corresponding modular space. The main result in this paper states that if C is a ρ-bounded and ρ-a.e. sequentially compact subset of Lρ and T : C → C is an asymptotically regular mapping such that lim inf_{n→∞} |T^n| < 2, where |S| denotes the Lipschitz constant of S, then T has a fixed point. We show that the estimate lim inf_{n→∞} |T^n| < 2 cannot, in general, be improved.
Plan Andaluz de Investigación (Junta de Andalucía)
