339 research outputs found

    Practical targeted learning from large data sets by survey sampling

    We address the practical construction of asymptotic confidence intervals for smooth (i.e., path-wise differentiable), real-valued statistical parameters by targeted learning from independent and identically distributed data in contexts where sample size is so large that it poses computational challenges. We observe some summary measure of all data and select a sub-sample from the complete data set by Poisson rejective sampling with unequal inclusion probabilities based on the summary measures. Targeted learning is carried out from the easier to handle sub-sample. We derive a central limit theorem for the targeted minimum loss estimator (TMLE) which enables the construction of the confidence intervals. The inclusion probabilities can be optimized to reduce the asymptotic variance of the TMLE. We illustrate the procedure with two examples where the parameters of interest are variable importance measures of an exposure (binary or continuous) on an outcome. We also conduct a simulation study and comment on its results. Keywords: semiparametric inference; survey sampling; targeted minimum loss estimation (TMLE).

    The Effects of a "Fat Tax" on the Nutrient Intake of French Households

    This article assesses the effects of a "fat tax" on the nutrient intake of French households across different income groups by estimating household nutrient elasticities. We estimate a complete demand system by aggregating an individual demand system over cohorts; the cohort model is justified by the incompleteness of our data. We find that a "fat tax" would have ambiguous and extremely small effects on the nutrient intake of French households, and that its associated economic welfare costs would be similarly small. Keywords: household survey data; demand system; nutrient elasticities; Food Consumption/Nutrition/Food Safety.

    The weak effects of a "fat tax" on French households' food purchases: a nutrient-based approach

    The World Health Organization regards overweight and obesity as one of the world's major public health problems. In France, according to the Enquête Individuelle et Nationale sur les Consommations Alimentaires (INCA2 2006-2007), 38.9% of adult men and 24.2% of adult women are overweight, and 11.6% of adults, men and women alike, are obese. For 2002, the medical cost of obesity is estimated at between 1.5% and 4.6% of health expenditure, according to the Institut de recherche et documentation en économie de la santé (IRDES). The rise of obesity and its economic repercussions have led public authorities to consider measures that could change food consumption behavior. By estimating a demand system, we assess the influence and relevance of a policy of taxing foods high in calories, fat, and sugar, known as a "fat tax". We show that the effects of this "fat tax" on food purchases, and on the resulting purchases of calories and nutrients, are small. Its influence on individuals' weight is also small in the short run but tends to grow in the long run. Finally, although the "fat tax" generates substantial tax revenue, it weighs more heavily on low-income households.

    Pivotal estimation in high-dimensional regression via linear programming

    We propose a new method of estimation for the high-dimensional linear regression model. It allows for very weak distributional assumptions, including heteroscedasticity, and does not require knowledge of the variance of the random errors. The method is based on linear programming only, so its numerical implementation is faster than that of previously known techniques using conic programs, and it allows one to deal with higher-dimensional models. We provide upper bounds for the estimation and prediction errors of the proposed estimator, showing that it achieves the same rate as in the more restrictive situation of fixed design and i.i.d. Gaussian errors with known variance. Following Gautier and Tsybakov (2011), we obtain the results under weaker sensitivity assumptions than the restricted eigenvalue or similar conditions.

    Bootstrapping Quasi Likelihood Ratio Tests under Misspecification

    We consider quasi likelihood ratio (QLR) tests for restrictions on parameters under potential model misspecification. For convex M-estimation, including quantile regression, we propose a general and simple nonparametric bootstrap procedure that yields asymptotically valid critical values. The method modifies the bootstrap objective function to mimic what happens under the null hypothesis. When testing for a univariate restriction, we show how the test statistic can be made asymptotically pivotal. Our bootstrap can then provide asymptotic refinements, as illustrated for a linear regression model. A Monte Carlo study and an empirical application illustrate that the double bootstrap of the QLR test controls level well and is powerful.

    Detection of fast radio transients with multiple stations: a case study using the Very Long Baseline Array

    Recent investigations reveal an important new class of transient radio phenomena that occur on sub-millisecond timescales. Often transient surveys' data volumes are too large to archive exhaustively. Instead, an on-line automatic system must excise impulsive interference and detect candidate events in real-time. This work presents a case study using data from multiple geographically distributed stations to perform simultaneous interference excision and transient detection. We present several algorithms that incorporate dedispersed data from multiple sites, and report experiments with a commensal real-time transient detection system on the Very Long Baseline Array (VLBA). We test the system using observations of pulsar B0329+54. The multiple-station algorithms enhanced sensitivity for detection of individual pulses. These strategies could improve detection performance for a future generation of geographically distributed arrays such as the Australian Square Kilometre Array Pathfinder and the Square Kilometre Array. Comment: 12 pages, 14 figures. Accepted for Ap

    Consistency of the Frequency Domain Bootstrap for differentiable functionals

    In this paper, consistency of the Frequency Domain Bootstrap (FDB) for differentiable functionals of the spectral density function of a linear stationary time series is discussed. The notion of an influence function in the time domain on spectral measures is introduced. Moreover, the Fréchet differentiability of functionals of spectral measures is defined. Sufficient and necessary conditions for consistency of the FDB in the considered problems are provided, and second-order correctness is discussed for some functionals. Finally, validity of the FDB for empirical processes is considered.

    Generalized Empirical Likelihood M Testing for Semiparametric Models with Time Series Data

    The problem of testing for the correct specification of semiparametric models with time series data is considered. Two general classes of M test statistics that are based on the generalized empirical likelihood method are proposed. A test for omitted covariates in a semiparametric time series regression model is then used to showcase the results. Monte Carlo experiments show that the tests have reasonable size and power properties in finite samples. An application to the demand for electricity in Ontario (Canada) illustrates their usefulness in practice.