
    Robust estimation of the vector autoregressive model by a least trimmed squares procedure.

    The vector autoregressive model is very popular for modeling multiple time series. Estimation of its parameters is typically done by a least squares procedure. However, this estimation method is unreliable when outliers are present in the data, and therefore we propose to estimate the vector autoregressive model by using a multivariate least trimmed squares estimator. We also show how the order of the autoregressive model can be determined in a robust way. The robust procedure is illustrated on a real data set.

    Keywords: robustness; multivariate time series; outliers; trimming; vector autoregressive models.
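    The trimming idea behind a least trimmed squares fit can be sketched in a few lines. The toy below (plain Python, illustrative names only) fits a slope-through-origin regression by iterated concentration steps: refit on the h observations with the smallest squared residuals. The paper's estimator is the multivariate analogue applied to the VAR model, and a real implementation would also use random restarts; this sketch starts from a single least squares fit.

```python
def slope(pairs):
    """Ordinary least squares slope through the origin for (x, y) pairs."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, y in pairs)

def lts_slope(pairs, keep=0.75, n_steps=10):
    """Least trimmed squares sketch: repeat the concentration step of
    refitting on the `h` pairs with the smallest squared residuals."""
    h = max(2, int(keep * len(pairs)))   # size of the retained subset
    beta = slope(pairs)                  # start from plain least squares
    for _ in range(n_steps):
        ranked = sorted(pairs, key=lambda p: (p[1] - beta * p[0]) ** 2)
        beta = slope(ranked[:h])         # refit on the cleanest h pairs
    return beta

# Six clean points on y = 0.5 x, plus two gross vertical outliers.
data = [(1, 0.5), (2, 1.0), (3, 1.5), (4, 2.0), (5, 2.5), (6, 3.0),
        (2, 9.0), (4, -7.0)]
print(slope(data))      # pulled away from 0.5 by the outliers
print(lts_slope(data))  # recovers 0.5
```

    On this data the plain least squares slope is dragged well away from 0.5, while the trimmed fit identifies the six clean points and recovers the slope exactly.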

    Influence of observations on the misclassification probability in quadratic discriminant analysis.

    This paper studies how observations in the training sample affect the misclassification probability of a quadratic discriminant rule. An approach based on partial influence functions is followed. It allows one to quantify the effect of observations in the training sample on the performance of the associated classification rule. The focus is on the effect of outliers on the misclassification rate, rather than on the estimates of the parameters of the quadratic discriminant rule. The expression for the partial influence function is then used to construct a diagnostic tool for detecting influential observations. Applications to real data sets are provided.

    Keywords: applications; classification; data; diagnostics; discriminant analysis; estimator; functions; influence function; misclassification probability; outliers; parameters; partial influence functions; performance; principal components; probability; quadratic discriminant analysis; tool; training.
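    A crude empirical stand-in for this kind of diagnostic is a case-deletion check: measure how much removing a single training observation changes the classifier's error rate on a fixed evaluation set. The sketch below is not the paper's partial influence function, and uses a toy 1-D nearest-class-mean rule rather than a quadratic discriminant rule; all names are illustrative.

```python
def fit(train):
    """Toy 1-D nearest-class-mean classifier from (x, label) pairs."""
    means = {}
    for lab in {l for _, l in train}:
        vals = [x for x, l in train if l == lab]
        means[lab] = sum(vals) / len(vals)
    return lambda x: min(means, key=lambda lab: abs(x - means[lab]))

def error_rate(clf, data):
    return sum(clf(x) != l for x, l in data) / len(data)

def deletion_influence(train, eval_set):
    """Change in error rate caused by deleting each training point."""
    base = error_rate(fit(train), eval_set)
    return [error_rate(fit(train[:i] + train[i + 1:]), eval_set) - base
            for i in range(len(train))]

train = [(0.0, 'a'), (0.2, 'a'), (5.0, 'a'),   # 5.0 is a gross outlier
         (1.0, 'b'), (1.2, 'b'), (0.9, 'b')]
evals = [(0.1, 'a'), (0.15, 'a'), (1.05, 'b'), (1.1, 'b')]
infl = deletion_influence(train, evals)
print(infl)  # the outlier's entry stands out with a large negative value
```

    Deleting the outlier drops the evaluation error from 0.5 to 0, so its influence entry is strongly negative while the clean points have influence zero; that is the pattern an influence diagnostic flags.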

    Unbiased Tail Estimation by an Extension of the Generalized Pareto Distribution

    AMS classifications: 62G20; 62G32. Keywords: bias; exchange rate; heavy tails; peaks-over-threshold; regular variation; tail index.

    Comparison of five assays for DNA extraction from bacterial cells in human faecal samples

    Aim: To determine the most effective DNA extraction method for bacteria in faecal samples. Materials and Results: This study assessed five commercial methods, that is, NucliSens easyMag, QIAamp DNA Stool Mini kit, PureLink Microbiome DNA purification kit, QIAamp PowerFecal DNA kit and RNeasy PowerMicrobiome kit, of which the latter has been optimized for DNA extraction. The DNA quantity and quality were determined using Nanodrop, Qubit and qPCR. The PowerMicrobiome kit recovered the highest DNA concentration, and this kit also recovered the highest gene copy number of Gram positives, Gram negatives and total bacteria. Furthermore, the PowerMicrobiome kit in combination with mechanical pre-treatment (bead beating) and with combined enzymatic and mechanical pre-treatment (proteinase K + mutanolysin + bead beating) was more effective than without pre-treatment. Conclusion: Of the five DNA extraction methods that were compared, the PowerMicrobiome kit, preceded by bead beating, which is included as standard, was found to be the most effective DNA extraction method for bacteria in faecal samples. Significance and Impact of the Study: Obtaining DNA of sufficient quantity and quality from human faecal samples is a first important step in optimizing molecular methods. Here we have shown that the PowerMicrobiome kit is an effective DNA extraction method for bacterial cells in faecal samples for downstream qPCR purposes.

    Trimmed bagging.

    Bagging has been found to be successful in increasing the predictive performance of unstable classifiers. Bagging draws bootstrap samples from the training sample, applies the classifier to each bootstrap sample, and then averages over all obtained classification rules. The idea of trimmed bagging is to exclude the bootstrapped classification rules that yield the highest error rates, as estimated by the out-of-bag error rate, and to aggregate over the remaining ones. In this note we explore the potential benefits of trimmed bagging. On the basis of numerical experiments, we conclude that trimmed bagging performs comparably to standard bagging when applied to unstable classifiers such as decision trees, but yields better results when applied to more stable base classifiers, like support vector machines.
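    The procedure described above (bootstrap, rank by out-of-bag error, trim, aggregate) can be sketched directly. The base learner below is a deliberately crude 1-D nearest-class-mean rule standing in for a decision tree or SVM, and all names are illustrative; only the trimming logic reflects the note's idea.

```python
import random

def fit_nearest_mean(sample):
    """Toy base learner: 1-D nearest-class-mean rule from (x, label) pairs."""
    means = {}
    for lab in {l for _, l in sample}:
        vals = [x for x, l in sample if l == lab]
        means[lab] = sum(vals) / len(vals)
    return lambda x: min(means, key=lambda lab: abs(x - means[lab]))

def trimmed_bagging(train, n_boot=50, trim=0.2, seed=0):
    """Fit n_boot bootstrap classifiers, drop the worst `trim` fraction
    by out-of-bag (OOB) error, and aggregate the rest by majority vote."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_boot):
        idx = [rng.randrange(len(train)) for _ in range(len(train))]
        h = fit_nearest_mean([train[i] for i in idx])
        chosen = set(idx)
        oob = [train[i] for i in range(len(train)) if i not in chosen]
        err = sum(h(x) != l for x, l in oob) / len(oob) if oob else 1.0
        scored.append((err, h))
    scored.sort(key=lambda t: t[0])                   # lowest OOB error first
    kept = [h for _, h in scored[:int((1 - trim) * n_boot)]]
    def ensemble(x):
        votes = [h(x) for h in kept]                  # majority vote
        return max(set(votes), key=votes.count)
    return ensemble

# Two well-separated classes on the line.
train = [(0.1, 'a'), (0.2, 'a'), (0.3, 'a'),
         (0.9, 'b'), (1.0, 'b'), (1.1, 'b')]
clf = trimmed_bagging(train)
print(clf(0.15), clf(0.95))  # a b
```

    Bootstrap samples that happen to contain only one class produce useless rules with high out-of-bag error; the trimming step is what discards exactly those before aggregation.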

    Classification efficiencies for robust linear discriminant analysis.

    Linear discriminant analysis is typically carried out using Fisher’s method. This method relies on the sample averages and covariance matrices computed from the different groups constituting the training sample. Since sample averages and covariance matrices are not robust, it has been proposed to use robust estimators of location and covariance instead, yielding a robust version of Fisher’s method. In this paper relative classification efficiencies of the robust procedures with respect to the classical method are computed. Second order influence functions appear to be useful for computing these classification efficiencies. It turns out that, when using an appropriate robust estimator, the loss in classification efficiency at the normal model remains limited. These findings are confirmed by finite sample simulations.

    Keywords: classification efficiency; discriminant analysis; error rate; Fisher rule; influence function; robustness.
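    The plug-in idea, replacing a non-robust location estimate by a robust one inside the classification rule, can be illustrated with a minimal sketch. This toy uses a nearest-centroid rule with coordinatewise medians in place of means; Fisher's rule additionally involves a (robust) covariance estimate, which is omitted here, and all names are illustrative.

```python
from statistics import median

def centroids(train, locate):
    """Per-class location estimates; `locate` maps a list of numbers to one."""
    out = {}
    for lab in {l for _, l in train}:
        pts = [x for x, l in train if l == lab]
        out[lab] = [locate([p[i] for p in pts]) for i in range(len(pts[0]))]
    return out

def classify(x, cents):
    """Assign x to the class with the nearest centroid (squared distance)."""
    dist2 = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(cents, key=lambda lab: dist2(x, cents[lab]))

mean = lambda v: sum(v) / len(v)
# Class 'a' around (0, 0) with one gross outlier; class 'b' around (3, 3).
train = [((0, 0), 'a'), ((0.2, -0.1), 'a'), ((0.1, 0.2), 'a'), ((40, 40), 'a'),
         ((3, 3), 'b'), ((3.2, 2.9), 'b'), ((2.8, 3.1), 'b')]

print(classify((0.5, 0.5), centroids(train, mean)))    # 'b', misled by the outlier
print(classify((0.5, 0.5), centroids(train, median)))  # 'a'
```

    The single outlier drags the class 'a' mean to roughly (10, 10), so the mean-based rule misclassifies an obvious 'a' point; the median-based centroid stays near the bulk of the class, which is the qualitative behaviour the robust version of Fisher's method aims for.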

    Influence of observations on the misclassification probability in quadratic discriminant analysis.

    This paper analyzes how observations in the training sample affect the misclassification probability of a quadratic discriminant rule. An approach based on partial influence functions is followed. It allows one to quantify the effect of observations in the training sample on the quality of the associated classification rule. The focus is more on the effect on the future misclassification rate than on the influence on the parameters of the quadratic discriminant rule. The expression for the influence function is then used to construct a diagnostic tool for detecting influential observations. Applications to real data sets are provided.

    Keywords: applications; classification; data; diagnostics; discriminant analysis; functions; influence function; misclassification probability; outliers; partial influence functions; probability; quadratic discriminant analysis; quality; robust covariance estimation; robust regression; training.

    Second-order refined peaks-over-threshold modelling for heavy-tailed distributions

    Modelling excesses over a high threshold using the Pareto or generalized Pareto distribution (PD/GPD) is the most popular approach in extreme value statistics. This method typically requires high thresholds in order for the (G)PD to fit well and in such a case applies only to a small upper fraction of the data. The extension of the (G)PD proposed in this paper is able to describe the excess distribution for lower thresholds in the case of heavy-tailed distributions. This yields a statistical model that can be fitted to a larger portion of the data. Moreover, estimates of tail parameters display stability for a larger range of thresholds. Our findings are supported by asymptotic results, simulations and a case study.

    Comment: to appear in the Journal of Statistical Planning and Inference.
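    The peaks-over-threshold step itself is simple to sketch: take the excesses over a threshold u and fit a GPD to them. For brevity the toy below uses the classical moment estimators of the GPD shape and scale (valid when the shape is below 1/2), not maximum likelihood and not the extended model of the paper; the data and names are illustrative.

```python
from statistics import mean, pvariance

def gpd_moment_fit(data, u):
    """Moment estimates (shape xi, scale sigma) from the excesses over u.

    For a GPD, E[Z] = sigma/(1-xi) and Var[Z] = sigma^2/((1-xi)^2 (1-2 xi)),
    so with r = mean^2/variance: xi = (1-r)/2 and sigma = mean*(1+r)/2."""
    excess = [x - u for x in data if x > u]
    m, v = mean(excess), pvariance(excess)
    r = m * m / v
    xi = 0.5 * (1.0 - r)         # shape: xi > 0 indicates a heavy tail
    sigma = 0.5 * m * (r + 1.0)  # scale
    return xi, sigma, len(excess)

# Toy heavy-tailed sample; threshold u = 1 leaves 7 excesses.
sample = [0.2, 0.5, 1.3, 1.8, 2.5, 4.0, 7.5, 15.0, 0.9, 3.2]
xi, sigma, k = gpd_moment_fit(sample, u=1.0)
print(k, round(xi, 3), round(sigma, 3))
```

    In practice one would repeat this over a grid of thresholds and look for a range where the estimates stabilize; the point of the paper's extension is that this stable range starts at lower thresholds, so more data contribute to the fit.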

    Robust estimation of the vector autoregressive model by a trimmed least squares procedure.

    The vector autoregressive model is very popular for modeling multiple time series. Estimation of its parameters is done by a least squares procedure. However, this estimation method is unreliable when outliers are present in the data, and there is a need for robust alternatives. In this paper we propose to estimate the vector autoregressive model by using a trimmed least squares estimator. We show how the order of the autoregressive model can be determined in a robust way, and how confidence bounds around the robustly estimated impulse response functions can be constructed. The resistance of the estimators to outliers is studied on real and simulated data.

    Keywords: advantages; calibration; data; estimator; least squares; M-estimators; methods; model; optimal; outliers; partial least squares; precision; prediction; regression; research; robust regression; robustness; squares; variables; yield; robust estimation; time; time series; order; functions.