Variable selection for model-based clustering using the integrated complete-data likelihood
Variable selection in cluster analysis is important yet challenging. It can
be achieved by regularization methods, which realize a trade-off between
clustering accuracy and the number of selected variables through a lasso-type
penalty. However, the calibration of the penalty term is open to criticism.
Model selection methods are an efficient alternative, yet they require a
difficult optimization of an information criterion that involves combinatorial
problems. First, most of these optimization algorithms rely on a suboptimal
procedure (e.g. a stepwise method). Second, the algorithms are computationally
costly because they require multiple calls of EM algorithms. Here we propose
to use a new information criterion based on the integrated complete-data
likelihood. It does not require any estimate and its maximization is simple and
computationally efficient. The original contribution of our approach is to
perform the model selection without requiring any parameter estimation;
parameter inference is then needed only for the single selected model. This
approach is applied to variable selection for a Gaussian mixture model under a
conditional independence assumption. Numerical experiments on simulated and
benchmark datasets show that the proposed method often outperforms two
classical approaches to variable selection.
Comment: submitted to Statistics and Computing
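To illustrate the key mechanism behind a criterion of this kind — closed-form integrated complete-data likelihoods that let variables be screened for a fixed partition without estimating any mixture parameter — here is a minimal Python sketch. It assumes Normal-inverse-Gamma conjugate priors on each Gaussian variable and compares the per-cluster marginal likelihood against the pooled one; the function names log_marginal_gaussian and select_variables are illustrative, and the sketch omits the paper's actual maximization over partitions and variable subsets.

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_gaussian(x, mu0=0.0, kappa0=1.0, a0=1.0, b0=1.0):
    """Closed-form log marginal likelihood of a univariate Gaussian sample
    under a Normal-inverse-Gamma prior (standard conjugate result)."""
    n = x.size
    if n == 0:
        return 0.0
    xbar = x.mean()
    kappan = kappa0 + n
    an = a0 + n / 2.0
    bn = (b0 + 0.5 * np.sum((x - xbar) ** 2)
          + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappan))
    return (-0.5 * n * np.log(2.0 * np.pi) + 0.5 * np.log(kappa0 / kappan)
            + gammaln(an) - gammaln(a0) + a0 * np.log(b0) - an * np.log(bn))

def select_variables(X, z):
    """For a fixed partition z, keep variable j when its integrated likelihood
    split by cluster beats the pooled (no clustering) one; conjugacy makes both
    terms closed-form, so no mixture parameter has to be estimated."""
    relevant = []
    for j in range(X.shape[1]):
        clustered = sum(log_marginal_gaussian(X[z == k, j]) for k in np.unique(z))
        pooled = log_marginal_gaussian(X[:, j])
        if clustered > pooled:
            relevant.append(j)
    return relevant

# Toy check: variable 0 separates two clusters, variable 1 is pure noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.r_[rng.normal(-3, 1, 100), rng.normal(3, 1, 100)],
                     rng.normal(0, 1, 200)])
z = np.r_[np.zeros(100, int), np.ones(100, int)]
print(select_variables(X, z))   # typically prints [0]
```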
Efficient learning in Approximate Bayesian Computation
Bayesian model selection in logistic regression for the detection of adverse drug reactions
Motivation: Spontaneous adverse event reports have a high potential for
detecting adverse drug reactions. However, due to their dimension, exploring
such databases requires statistical methods. In this context,
disproportionality measures are used, but by projecting the data onto
contingency tables they become sensitive to co-prescriptions and masking
effects. Recently, logistic regressions with a lasso-type penalty have been
used to detect associations between drugs and adverse events. However, the
choice of the penalty value is open to criticism, even though it strongly
influences the results. Results: In this paper, we propose to use a logistic
regression whose sparsity is viewed as a model selection challenge. Since the
model space is huge, a Metropolis-Hastings algorithm carries out the model
selection by maximizing the BIC criterion, thereby avoiding the calibration of
any penalty or threshold. In an application to the French pharmacovigilance
database, the proposed method is compared to well-established approaches on a
reference data set and obtains better rates of positive and negative controls.
However, many signals are not detected by the proposed method, so we conclude
that it should be used alongside existing pharmacovigilance measures.
Comment: 7 pages, 3 figures, submitted to Biometrical Journal
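As an illustration of model selection by Metropolis-Hastings over a large model space, the Python sketch below runs a random-walk sampler on variable-inclusion vectors for a logistic regression, targeting exp(-BIC/2) and remembering the best model visited (best in the usual convention where a smaller BIC is better). It is a generic sketch rather than the authors' algorithm: the single-coordinate flip proposal, the unpenalized maximum-likelihood fit via scipy.optimize, and the name mh_model_search are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(beta, X, y):
    """Negative log-likelihood of a logistic regression (numerically stable)."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta)) - y @ eta

def bic(X, y, cols):
    """Unpenalized fit on the selected columns (plus intercept), then BIC."""
    Xs = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    res = minimize(neg_loglik, np.zeros(Xs.shape[1]), args=(Xs, y), method="BFGS")
    return 2.0 * res.fun + Xs.shape[1] * np.log(len(y))

def mh_model_search(X, y, n_iter=2000, seed=0):
    """Random-walk Metropolis-Hastings on inclusion vectors, targeting
    exp(-BIC/2) and keeping track of the lowest-BIC model visited."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma = np.zeros(p, dtype=bool)
    current = bic(X, y, np.flatnonzero(gamma))
    best, best_gamma = current, gamma.copy()
    for _ in range(n_iter):
        prop = gamma.copy()
        prop[rng.integers(p)] ^= True                        # flip one inclusion indicator
        cand = bic(X, y, np.flatnonzero(prop))
        if np.log(rng.random()) < (current - cand) / 2.0:    # MH acceptance step
            gamma, current = prop, cand
            if current < best:
                best, best_gamma = current, gamma.copy()
    return best_gamma, best

# Usage sketch: X is an (n, p) drug-exposure matrix, y an (n,) adverse-event indicator.
# selected, score = mh_model_search(X, y)
```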
Efficient learning in ABC algorithms
Approximate Bayesian Computation has been successfully used in population
genetics to bypass the calculation of the likelihood. These methods provide
accurate estimates of the posterior distribution by comparing the observed
dataset to a sample of datasets simulated from the model. Although
parallelization is easily achieved, computation times for ensuring a suitable
approximation quality of the posterior distribution remain high. To alleviate
this computational burden, we propose an adaptive, sequential algorithm that
runs faster than other ABC algorithms while maintaining the accuracy of the
approximation. This proposal relies on the sequential Monte Carlo sampler of
Del Moral et al. (2012) but is calibrated to reduce the number of simulations
from the model. The paper concludes with numerical experiments on a toy example
and on a population genetic study of Apis mellifera, where our algorithm is
shown to be faster than traditional ABC schemes.
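The sketch below illustrates the general flavour of sequential ABC on a toy problem (inferring the mean of a unit-variance Gaussian under a N(0, 10) prior), with a tolerance that adapts each round to a quantile of the previous round's distances. It follows a standard ABC population Monte Carlo scheme rather than the calibrated SMC sampler proposed in the paper; the summary statistic, prior, kernel scaling and the name abc_pmc_toy are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def abc_pmc_toy(y_obs, n_particles=500, n_rounds=5, quantile=0.5, seed=0):
    """Sequential (population Monte Carlo) ABC on a toy model: y_i ~ N(theta, 1)
    with a N(0, 10) prior on theta. The tolerance shrinks each round to a
    quantile of the previous round's accepted distances."""
    rng = np.random.default_rng(seed)
    s_obs = y_obs.mean()                                   # summary statistic
    prior = norm(0.0, np.sqrt(10.0))

    def distance(theta):                                   # |summary(sim) - summary(obs)|
        return abs(rng.normal(theta, 1.0, y_obs.size).mean() - s_obs)

    # Round 0: plain rejection sampling from the prior.
    theta = rng.normal(0.0, np.sqrt(10.0), n_particles)
    dist = np.array([distance(t) for t in theta])
    w = np.full(n_particles, 1.0 / n_particles)
    eps = np.quantile(dist, quantile)

    for _ in range(n_rounds):
        tau = np.sqrt(2.0 * np.cov(theta, aweights=w))     # perturbation kernel scale
        new_theta = np.empty(n_particles)
        new_dist = np.empty(n_particles)
        for i in range(n_particles):
            while True:                                    # propose until accepted at eps
                cand = rng.normal(rng.choice(theta, p=w), tau)
                d = distance(cand)
                if d <= eps:
                    new_theta[i], new_dist[i] = cand, d
                    break
        # Importance weights correct for the resample-and-perturb proposal.
        kern = norm.pdf(new_theta[:, None], loc=theta[None, :], scale=tau)
        new_w = prior.pdf(new_theta) / (kern * w[None, :]).sum(axis=1)
        theta, dist, w = new_theta, new_dist, new_w / new_w.sum()
        eps = np.quantile(dist, quantile)                  # shrink the tolerance
    return theta, w

# Usage: weighted posterior sample for the mean of 100 observations drawn at theta = 2.
# data = np.random.default_rng(1).normal(2.0, 1.0, 100)
# samples, weights = abc_pmc_toy(data)
```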
Uniformly Lipschitzian mappings in modular function spaces
Let ρ be a convex modular function satisfying a ∆2-type condition and Lρ the
corresponding modular space. Assume that C is a ρ-bounded and ρ-a.e. compact subset of Lρ and T : C → C is a k-uniformly Lipschitzian mapping. We prove that T has a fixed point if k < (Ñ(Lρ))^(−1/2), where Ñ(Lρ) is a geometrical coefficient of normal structure. We also show that Ñ(Lρ) < 1 in modular Orlicz spaces for uniformly convex Orlicz functions.
Dirección General de Investigación Científica y Técnica; Plan Andaluz de Investigación (Junta de Andalucía)
Asymptotically regular mappings in modular function spaces
Let ρ be a modular function satisfying a ∆2-type condition and Lρ the corresponding modular space. The main result in this paper states that if C is a ρ-bounded and ρ-a.e. sequentially compact subset of Lρ and T : C → C is an asymptotically regular mapping such that lim inf_{n→∞} |T^n| < 2, where |S| denotes the Lipschitz constant of S, then T has a fixed point. We show that the estimate lim inf_{n→∞} |T^n| < 2 cannot, in general, be improved.
Plan Andaluz de Investigación (Junta de Andalucía)
