
    Variable selection for model-based clustering using the integrated complete-data likelihood

    Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between clustering accuracy and the number of selected variables through a lasso-type penalty. However, the calibration of the penalty term is open to criticism. Model selection methods are an efficient alternative, yet they require a difficult optimization of an information criterion that involves combinatorial problems. First, most of these optimization algorithms rely on a suboptimal procedure (e.g. a stepwise method). Second, they are computationally costly because they require multiple calls to EM algorithms. Here we propose a new information criterion based on the integrated complete-data likelihood. It does not require any parameter estimate, and its maximization is simple and computationally efficient. The original contribution of our approach is to perform model selection without any parameter estimation; parameter inference is then needed only for the single selected model. This approach is used for variable selection in a Gaussian mixture model under a conditional independence assumption. Numerical experiments on simulated and benchmark datasets show that the proposed method often outperforms two classical approaches to variable selection. Comment: submitted to Statistics and Computing.
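    A minimal sketch of the general idea, assuming a toy dataset and scikit-learn's diagonal-covariance GaussianMixture: candidate subsets of clustering variables are scored by an information criterion, with BIC standing in for the paper's integrated complete-data criterion and exhaustive enumeration standing in for its more efficient, estimation-free search (unlike the paper's method, this sketch does fit a mixture per candidate model). All names and data below are illustrative, not the authors' implementation.

```python
# Hedged sketch: information-criterion-based variable selection for a
# conditionally independent (diagonal-covariance) Gaussian mixture.
# BIC is used as a stand-in for the paper's integrated complete-data criterion.
import itertools
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two clusters separated on the first two variables, pure noise elsewhere.
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 5)),
               rng.normal([3, 3, 0, 0, 0], 1.0, size=(100, 5))])
n, d = X.shape
K = 2  # number of mixture components

def bic_for_subset(subset):
    """Clustering variables follow a K-component diagonal mixture; the
    remaining variables follow independent single Gaussians. Lower is better."""
    rest = [j for j in range(d) if j not in subset]
    gm = GaussianMixture(K, covariance_type="diag", random_state=0).fit(X[:, subset])
    loglik = gm.score(X[:, subset]) * n            # total mixture log-likelihood
    for j in rest:                                  # non-clustering variables
        loglik += norm.logpdf(X[:, j], X[:, j].mean(), X[:, j].std()).sum()
    n_par = (K - 1) + 2 * K * len(subset) + 2 * len(rest)
    return -2.0 * loglik + n_par * np.log(n)

subsets = [list(s) for r in range(1, d + 1)
           for s in itertools.combinations(range(d), r)]
best = min(subsets, key=bic_for_subset)
print("selected clustering variables:", best)
```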

    Bayesian model selection in logistic regression for the detection of adverse drug reactions

    Motivation: Spontaneous adverse event reports have a high potential for detecting adverse drug reactions. However, due to their size, exploring such databases requires statistical methods. In this context, disproportionality measures are used; however, by projecting the data onto contingency tables, these methods become sensitive to the problem of co-prescriptions and masking effects. Recently, logistic regressions with a lasso-type penalty have been used to detect associations between drugs and adverse events, but the choice of the penalty value is open to criticism even though it strongly influences the results. Results: In this paper, we propose a logistic regression whose sparsity is treated as a model selection problem. Since the model space is huge, a Metropolis-Hastings algorithm carries out the model selection by maximizing the BIC criterion; we thereby avoid calibrating a penalty or threshold. In our application to the French pharmacovigilance database, the proposed method is compared to well-established approaches on a reference data set and obtains better rates of positive and negative controls. However, many signals are not detected by the proposed method, so we conclude that it should be used in parallel with existing measures in pharmacovigilance. Comment: 7 pages, 3 figures, submitted to Biometrical Journal.
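    A minimal sketch of the general idea on simulated toy data (not the pharmacovigilance application): a Metropolis-Hastings walk over covariate-inclusion indicators, scoring each visited logistic regression by BIC. The sketch uses the lower-is-better BIC convention and statsmodels for the fits; data, tuning, and variable names are illustrative, not the authors' implementation.

```python
# Hedged sketch: BIC-guided model selection for sparse logistic regression
# via Metropolis-Hastings over inclusion indicators, on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, p = 500, 20
X = rng.normal(size=(n, p))
true = [0, 3, 7]                                   # truly associated covariates
logits = X[:, true] @ np.array([1.5, -2.0, 1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

def bic(model):
    """BIC of a logistic regression on the selected covariates (lower is better)."""
    cols = np.flatnonzero(model)
    design = sm.add_constant(X[:, cols]) if cols.size else np.ones((n, 1))
    fit = sm.Logit(y, design).fit(disp=0)
    return -2.0 * fit.llf + (cols.size + 1) * np.log(n)

current = np.zeros(p, dtype=bool)                  # start from the empty model
current_bic = bic(current)
best, best_bic = current.copy(), current_bic
for _ in range(2000):
    proposal = current.copy()
    proposal[rng.integers(p)] ^= True              # flip one inclusion indicator
    prop_bic = bic(proposal)
    # Accept with probability exp((current_bic - prop_bic) / 2), i.e. target
    # model probabilities proportional to exp(-BIC / 2).
    if np.log(rng.uniform()) < (current_bic - prop_bic) / 2.0:
        current, current_bic = proposal, prop_bic
        if current_bic < best_bic:
            best, best_bic = current.copy(), current_bic
print("selected covariates:", np.flatnonzero(best))
```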

    Efficient learning in ABC algorithms

    Approximate Bayesian Computation has been successfully used in population genetics to bypass the calculation of the likelihood. These methods provide accurate estimates of the posterior distribution by comparing the observed dataset to a sample of datasets simulated from the model. Although parallelization is easily achieved, the computation time needed to ensure a suitable approximation quality of the posterior distribution remains high. To alleviate this computational burden, we propose an adaptive, sequential algorithm that runs faster than other ABC algorithms while maintaining the accuracy of the approximation. The proposal relies on the sequential Monte Carlo sampler of Del Moral et al. (2012) but is calibrated to reduce the number of simulations from the model. The paper concludes with numerical experiments on a toy example and on a population genetics study of Apis mellifera, in which our algorithm is shown to be faster than traditional ABC schemes.
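    A minimal sketch of a sequential ABC loop with an adaptive, shrinking tolerance on a toy Gaussian-mean example. It illustrates only the general population/tolerance-schedule idea, not the Del Moral et al. (2012) calibration used in the paper, and it omits the importance weights a full ABC-SMC sampler carries for the perturbation kernel; all names and settings are illustrative.

```python
# Hedged sketch: sequential ABC with an adaptive tolerance schedule
# (importance weights for the perturbation kernel are omitted for brevity).
import numpy as np

rng = np.random.default_rng(2)
obs = rng.normal(2.0, 1.0, size=50)                # "observed" data
s_obs = obs.mean()                                  # summary statistic

def simulate(theta):
    """Simulate a dataset of the same size and return its summary statistic."""
    return rng.normal(theta, 1.0, size=50).mean()

n_particles, n_rounds = 500, 5
# Round 0: draw from the prior N(0, 10^2) and record distances to the observation.
particles = rng.normal(0.0, 10.0, size=n_particles)
dists = np.abs(np.array([simulate(t) for t in particles]) - s_obs)

for _ in range(1, n_rounds):
    eps = np.quantile(dists, 0.5)                  # shrink the tolerance adaptively
    keep = dists <= eps
    sigma = 2.0 * particles[keep].std()            # perturbation scale
    new_particles, new_dists = [], []
    while len(new_particles) < n_particles:
        theta = rng.choice(particles[keep]) + rng.normal(0.0, sigma)
        d = abs(simulate(theta) - s_obs)
        if d <= eps:                               # accept only within tolerance
            new_particles.append(theta)
            new_dists.append(d)
    particles, dists = np.array(new_particles), np.array(new_dists)

print("posterior mean estimate:", particles.mean())
```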

    Uniformly Lipschitzian mappings in modular function spaces

    Let ρ be a convex modular function satisfying a ∆2-type condition and Lρ the corresponding modular space. Assume that C is a ρ-bounded and ρ-a.e. compact subset of Lρ and T : C → C is a k-uniformly Lipschitzian mapping. We prove that T has a fixed point if k < (Ñ(Lρ))^{-1/2}, where Ñ(Lρ) is a geometrical coefficient of normal structure. We also show that Ñ(Lρ) < 1 in modular Orlicz spaces for uniformly convex Orlicz functions.
    Funding: Dirección General de Investigación Científica y Técnica; Plan Andaluz de Investigación (Junta de Andalucía).
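    For readers less familiar with the terminology, a sketch of the standard formulations behind these hypotheses (not quoted from the paper; the ∆2-type condition in particular admits several equivalent variants):

```latex
% Standard definitions assumed above (a sketch, not quoted from the paper).
\[
  \text{$\Delta_2$-type condition: } \exists\, K > 0 \ \text{such that}\
  \rho(2f) \le K\,\rho(f) \quad \text{for all } f \in L_\rho ,
\]
\[
  \text{$k$-uniformly Lipschitzian } T\colon C \to C:\quad
  \rho\bigl(T^{n}(f) - T^{n}(g)\bigr) \le k\,\rho(f - g)
  \quad \text{for all } f, g \in C \text{ and all } n \ge 1 .
\]
```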

    Asymptotically regular mappings in modular function spaces

    Let ρ be a modular function satisfying a ∆2-type condition and Lρ the corresponding modular space. The main result in this paper states that if C is a ρ-bounded and ρ-a.e. sequentially compact subset of Lρ and T : C → C is an asymptotically regular mapping such that lim inf_{n→∞} |T^n| < 2, where |S| denotes the Lipschitz constant of S, then T has a fixed point. We show that the estimate lim inf_{n→∞} |T^n| < 2 cannot, in general, be improved.
    Funding: Plan Andaluz de Investigación (Junta de Andalucía).
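    A sketch of the standard formulations behind the two notions used above (not quoted from the paper):

```latex
% Standard definitions assumed above (a sketch, not quoted from the paper).
\[
  \text{asymptotic regularity of } T\colon C \to C:\quad
  \lim_{n\to\infty} \rho\bigl(T^{n+1}(f) - T^{n}(f)\bigr) = 0
  \quad \text{for every } f \in C ,
\]
\[
  |S| \;=\; \inf\bigl\{\, k \ge 0 :
  \rho\bigl(S(f) - S(g)\bigr) \le k\,\rho(f - g)
  \ \text{for all } f, g \in C \,\bigr\} .
\]
```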