    Genetic Classification of Populations using Supervised Learning

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies. Comment: Accepted PLOS On

    Last passage percolation and traveling fronts

    We consider a system of N particles with a stochastic dynamics introduced by Brunet and Derrida. The particles can be interpreted as last passage times in directed percolation on {1,...,N} of mean-field type. The particles remain grouped and move like a traveling wave, subject to discretization and driven by a random noise. As N increases, we obtain estimates for the speed of the front and its profile, for different laws of the driving noise. The Gumbel distribution plays a central role for the particle jumps, and we show that the scaling limit is a Lévy process in this case. The case of bounded jumps yields a completely different behavior.
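
    The abstract does not spell out the update rule, so the following is only a minimal sketch under an assumed form of the dynamics: at each step every particle moves to the maximum over all current positions plus an independent noise term, X_i(t+1) = max_j { X_j(t) + xi_ij(t+1) }. The function names (gumbel, step, front_speed) and the choice of a standard Gumbel law for the noise are illustrative assumptions, not taken from the paper.

```python
import math
import random


def gumbel(rng):
    """Draw a standard Gumbel variate by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def step(positions, rng):
    """One update of the ASSUMED rule X_i(t+1) = max_j (X_j(t) + noise_ij)."""
    n = len(positions)
    return [max(x + gumbel(rng) for x in positions) for _ in range(n)]


def front_speed(n, steps, seed=0):
    """Estimate the front speed as (mean position after `steps`) / steps."""
    rng = random.Random(seed)
    positions = [0.0] * n
    for _ in range(steps):
        positions = step(positions, rng)
    return (sum(positions) / n) / steps


if __name__ == "__main__":
    for n in (2, 10, 50):
        print(n, front_speed(n, 200))
```

    Running the script prints an empirical speed for a few values of N, which gives a rough feel for how the front accelerates as N grows; it is not a substitute for the estimates derived in the paper.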

    Learning a Factor Model via Regularized PCA

    We consider the problem of learning a linear factor model. We propose a regularized form of principal component analysis (PCA) and demonstrate through experiments with synthetic and real data the superiority of resulting estimates to those produced by pre-existing factor analysis approaches. We also establish theoretical results that explain how our algorithm corrects the biases induced by conventional approaches. An important feature of our algorithm is that its computational requirements are similar to those of PCA, which enjoys wide use in large part due to its efficiency.

    Growing interfaces uncover universal fluctuations behind scale invariance

    Stochastic motion of a point, known as Brownian motion, has many successful applications in science, thanks to its scale invariance and consequent universal features such as Gaussian fluctuations. In contrast, the stochastic motion of a line, though it is also scale-invariant and arises in nature as various types of interface growth, is far less understood. The two major missing ingredients are: an experiment that allows a quantitative comparison with theory and an analytic solution of the Kardar-Parisi-Zhang (KPZ) equation, a prototypical equation for describing growing interfaces. Here we solve both problems, showing unprecedented universality beyond the scaling laws. We investigate growing interfaces of liquid-crystal turbulence and find not only universal scaling, but universal distributions of interface positions. They obey the largest-eigenvalue distributions of random matrices and depend on whether the interface is curved or flat, albeit universal in each case. Our exact solution of the KPZ equation provides theoretical explanations. Comment: 5 pages, 3 figures, supplementary information available on Journal pag

    Functional Renormalization Group and the Field Theory of Disordered Elastic Systems

    We study elastic systems such as interfaces or lattices, pinned by quenched disorder. To escape triviality as a result of "dimensional reduction", we use the functional renormalization group. Difficulties arise in the calculation of the renormalization group functions beyond 1-loop order. Even worse, observables such as the 2-point correlation function exhibit the same problem already at 1-loop order. These difficulties are due to the non-analyticity of the renormalized disorder correlator at zero temperature, which is inherent to the physics beyond the Larkin length, characterized by many metastable states. As a result, 2-loop diagrams, which involve derivatives of the disorder correlator at the non-analytic point, are naively "ambiguous". We examine several routes out of this dilemma, which lead to a unique renormalizable field-theory at 2-loop order. It is also the only theory consistent with the potentiality of the problem. The beta-function differs from previous work and the one at depinning by novel "anomalous terms". For interfaces and random bond disorder we find a roughness exponent zeta = 0.20829804 epsilon + 0.006858 epsilon^2, epsilon = 4-d. For random field disorder we find zeta = epsilon/3 and compute universal amplitudes to order epsilon^2. For periodic systems we evaluate the universal amplitude of the 2-point function. We also clarify the dependence of universal amplitudes on the boundary conditions at large scale. All predictions are in good agreement with numerical and exact results, and an improvement over one loop. Finally we calculate higher correlation functions, which turn out to be equivalent to those at depinning to leading order in epsilon. Comment: 42 pages, 41 figure
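
    The two roughness-exponent formulas quoted in the abstract can be evaluated directly. The sketch below plugs in epsilon = 4 - d for a few dimensions; the function names are hypothetical, and evaluating a truncated epsilon expansion at epsilon = 2 or 3 is a naive extrapolation rather than a result claimed by the authors.

```python
def roughness_random_bond(d):
    """Two-loop expansion quoted in the abstract, with epsilon = 4 - d."""
    eps = 4 - d
    return 0.20829804 * eps + 0.006858 * eps ** 2


def roughness_random_field(d):
    """Random-field result quoted in the abstract: zeta = epsilon / 3."""
    return (4 - d) / 3


if __name__ == "__main__":
    for d in (3, 2, 1):
        print(d, roughness_random_bond(d), roughness_random_field(d))
```

    For d = 3 (epsilon = 1) the random-bond value is simply the sum of the two coefficients, 0.20829804 + 0.006858; the second-order term contributes only a few percent there but grows quadratically as d decreases.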

    A pedestrian's view on interacting particle systems, KPZ universality, and random matrices

    These notes are based on lectures delivered by the authors at a Langeoog seminar of SFB/TR12 "Symmetries and universality in mesoscopic systems" to a mixed audience of mathematicians and theoretical physicists. After a brief outline of the basic physical concepts of equilibrium and nonequilibrium states, the one-dimensional simple exclusion process is introduced as a paradigmatic nonequilibrium interacting particle system. The stationary measure on the ring is derived and the idea of the hydrodynamic limit is sketched. We then introduce the phenomenological Kardar-Parisi-Zhang (KPZ) equation and explain the associated universality conjecture for surface fluctuations in growth models. This is followed by a detailed exposition of a seminal paper of Johansson that relates the current fluctuations of the totally asymmetric simple exclusion process (TASEP) to the Tracy-Widom distribution of random matrix theory. The implications of this result are discussed within the framework of the KPZ conjecture. Comment: 52 pages, 4 figures; to appear in J. Phys. A: Math. Theo
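
    To make the exclusion rule concrete, here is a minimal sketch of TASEP on a ring. It uses a random-sequential update (pick a site uniformly at random; if it holds a particle and the next site is empty, the particle hops), which is a common discrete stand-in for the continuous-time dynamics and is an assumption here, as are the function name and parameters. It is not the construction used in the lecture notes.

```python
import random


def tasep_ring(num_sites, num_particles, num_updates, seed=0):
    """Simulate TASEP on a ring; return the final occupation and hop count."""
    rng = random.Random(seed)
    # Start from a uniformly random arrangement of the particles.
    occupied = [True] * num_particles + [False] * (num_sites - num_particles)
    rng.shuffle(occupied)
    hops = 0
    for _ in range(num_updates):
        i = rng.randrange(num_sites)
        j = (i + 1) % num_sites  # right neighbour, with periodic wrap-around
        if occupied[i] and not occupied[j]:
            # Exclusion rule: a hop happens only into an empty site.
            occupied[i], occupied[j] = False, True
            hops += 1
    return occupied, hops


if __name__ == "__main__":
    state, hops = tasep_ring(num_sites=50, num_particles=25, num_updates=10000)
    print(sum(state), hops)
```

    The hop count divided by the number of updates gives a crude estimate of the particle current; the particle number is conserved by construction, since every hop moves one particle and none are created or destroyed.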

    From Quantum Systems to L-Functions: Pair Correlation Statistics and Beyond

    The discovery of connections between the distribution of energy levels of heavy nuclei and spacings between prime numbers has been one of the most surprising and fruitful observations in the twentieth century. The connection between the two areas was first observed through Montgomery's work on the pair correlation of zeros of the Riemann zeta function. As its generalizations and consequences have motivated much of the following work, and to this day remains one of the most important outstanding conjectures in the field, it occupies a central role in our discussion below. We describe some of the many techniques and results from the past sixty years, especially the important roles played by numerical and experimental investigations, that led to the discovery of the connections and progress towards understanding the behaviors. In our survey of these two areas, we describe the common mathematics that explains the remarkable universality. We conclude with some thoughts on what might lie ahead in the pair correlation of zeros of the zeta function, and other similar quantities. Comment: Version 1.1, 50 pages, 6 figures. To appear in "Open Problems in Mathematics", Editors John Nash and Michael Th. Rassias. arXiv admin note: text overlap with arXiv:0909.491

    4D Imaging and Diffraction Dynamics of Single-Particle Phase Transition in Heterogeneous Ensembles

    In this Letter, we introduce conical-scanning dark-field imaging in four-dimensional (4D) ultrafast electron microscopy to visualize single-particle dynamics of a polycrystalline ensemble undergoing phase transitions. Specifically, the ultrafast metal–insulator phase transition of vanadium dioxide is induced using laser excitation and followed by taking electron-pulsed, time-resolved images and diffraction patterns. The single-particle selectivity is achieved by identifying the origin of all constituent Bragg spots on Debye–Scherrer rings from the ensemble. Orientation mapping and dynamic scattering simulation of the electron diffraction patterns in the monoclinic and tetragonal phases during the transition confirm the observed change of the Bragg spots with time. We found that the threshold temperature for phase recovery increases with increasing particle size, and we quantified the observation through a theoretical model developed for single-particle phase transitions. The reported methodology of conical scanning and orientation mapping in 4D imaging promises to be powerful for heterogeneous ensembles, as it enables imaging and diffraction at a given time with a full archive of structural information for each particle (for example, size, morphology, and orientation) while minimizing radiation damage to the specimen.

    Functional Myogenic Engraftment from Mouse iPS Cells

    Direct reprogramming of adult fibroblasts to a pluripotent state has opened new possibilities for the generation of patient- and disease-specific stem cells. However, the ability of induced pluripotent stem (iPS) cells to generate tissue that mediates functional repair has been demonstrated in very few animal models of disease to date. Here we present the proof of principle that iPS cells may be used effectively for the treatment of muscle disorders. We combine the generation of iPS cells with conditional expression of Pax7, a robust approach to derive myogenic progenitors. Transplantation of Pax7-induced iPS-derived myogenic progenitors into dystrophic mice results in extensive engraftment, which is accompanied by improved contractility of treated muscles. These findings demonstrate the myogenic regenerative potential of iPS cells and provide a rationale for their future therapeutic application for muscular dystrophies.