
    Removing the influence of a group variable in high-dimensional predictive modelling

    In many application areas, predictive models are used to support or make important decisions. There is increasing awareness that these models may contain spurious or otherwise undesirable correlations. Such correlations may arise from a variety of sources, including batch effects, systematic measurement errors, or sampling bias. Without explicit adjustment, machine learning algorithms trained using these data can produce poor out-of-sample predictions which propagate these undesirable correlations. We propose a method to pre-process the training data, producing an adjusted dataset that is statistically independent of the nuisance variables with minimum information loss. We develop a conceptually simple approach for creating an adjusted dataset in high-dimensional settings based on a constrained form of matrix decomposition. The resulting dataset can then be used in any predictive algorithm with the guarantee that predictions will be statistically independent of the group variable. We develop a scalable algorithm for implementing the method, along with theoretical support in the form of independence guarantees and optimality. The method is illustrated on some simulation examples and applied to two case studies: removing machine-specific correlations from brain scan data, and removing race and ethnicity information from a dataset used to predict recidivism. That the motivation for removing undesirable correlations is quite different in the two applications illustrates the broad applicability of our approach. Comment: Update. 18 pages, 3 figures

    Exact Static Cylindrical Solution to Conformal Weyl Gravity

    We present the exact exterior solution for a static and neutral cylindrically symmetric source in locally conformally invariant Weyl gravity. The general relativity analogue can still be recovered as a special case, but only as a sub-family of solutions. Our solution contains a linear term that results in a potential that grows linearly over large distances. This may have implications for exotic astrophysical structures as well as matter fields on extremely small scales. Comment: 8 pages, Physical Review

    Collective Dynamics of Random Polyampholytes

    We consider the Langevin dynamics of a semi-dilute system of chains which are random polyampholytes of average monomer charge $q$, with fluctuations in this charge of size $Q^{-1}$, and with freely floating counter-ions in the surroundings. We cast the dynamics into the functional integral formalism and average over the quenched charge distribution in order to compute the dynamic structure factor and the effective collective potential matrix. The results are given for small charge fluctuations. In the limit of finite $q$ we then find that the scattering approaches the limit of polyelectrolyte solutions. Comment: 13 pages including 6 figures, submitted J. Chem. Phys.

    Automated code generation for discontinuous Galerkin methods

    A compiler approach for generating low-level computer code from high-level input for discontinuous Galerkin finite element forms is presented. The input language mirrors conventional mathematical notation, and the compiler generates efficient code in a standard programming language. This facilitates the rapid generation of efficient code for general equations in varying spatial dimensions. Key concepts underlying the compiler approach and the automated generation of computer code are elaborated. The approach is demonstrated for a range of common problems, including the Poisson, biharmonic, advection-diffusion and Stokes equations.

    The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets.

    Probiotics for preventing acute otitis media in children

    This is a protocol for a Cochrane Review (Intervention). The objectives are as follows: to assess the effects of probiotics to prevent the occurrence and reduce the severity of acute otitis media in children.