1,201 research outputs found

    Natural selection reduced diversity on human Y chromosomes

    Get PDF
    The human Y chromosome exhibits surprisingly low levels of genetic diversity. This could result from neutral processes if the effective population size of males is reduced relative to females due to a higher variance in the number of offspring from males than from females. Alternatively, selection acting on new mutations, and affecting linked neutral sites, could reduce variability on the Y chromosome. Here, using genome-wide analyses of X, Y, autosomal and mitochondrial DNA, in combination with extensive population genetic simulations, we show that low observed Y chromosome variability is not consistent with a purely neutral model. Instead, we show that models of purifying selection are consistent with observed Y diversity. Further, the number of sites estimated to be under purifying selection greatly exceeds the number of Y-linked coding sites, suggesting the importance of the highly repetitive ampliconic regions. While we show that purifying selection removing deleterious mutations can explain the low diversity on the Y chromosome, we cannot exclude the possibility that positive selection acting on beneficial mutations could have also reduced diversity in linked neutral regions, and may have contributed to lowering human Y chromosome diversity. Because the functional significance of the ampliconic regions is poorly understood, our findings should motivate future research in this area.Comment: 43 pages, 11 figure

    Lab Retriever: a software tool for calculating likelihood ratios incorporating a probability of drop-out for forensic DNA profiles.

    Get PDF
    BackgroundTechnological advances have enabled the analysis of very small amounts of DNA in forensic cases. However, the DNA profiles from such evidence are frequently incomplete and can contain contributions from multiple individuals. The complexity of such samples confounds the assessment of the statistical weight of such evidence. One approach to account for this uncertainty is to use a likelihood ratio framework to compare the probability of the evidence profile under different scenarios. While researchers favor the likelihood ratio framework, few open-source software solutions with a graphical user interface implementing these calculations are available for practicing forensic scientists.ResultsTo address this need, we developed Lab Retriever, an open-source, freely available program that forensic scientists can use to calculate likelihood ratios for complex DNA profiles. Lab Retriever adds a graphical user interface, written primarily in JavaScript, on top of a C++ implementation of the previously published R code of Balding. We redesigned parts of the original Balding algorithm to improve computational speed. In addition to incorporating a probability of allelic drop-out and other critical parameters, Lab Retriever computes likelihood ratios for hypotheses that can include up to four unknown contributors to a mixed sample. These computations are completed nearly instantaneously on a modern PC or Mac computer.ConclusionsLab Retriever provides a practical software solution to forensic scientists who wish to assess the statistical weight of evidence for complex DNA profiles. Executable versions of the program are freely available for Mac OSX and Windows operating systems

    Engineering Synthetic TAL Effectors with Orthogonal Target Sites

    Get PDF
    The ability to engineer biological circuits that process and respond to complex cellular signals has the potential to impact many areas of biology and medicine. Transcriptional activator-like effectors (TALEs) have emerged as an attractive component for engineering these circuits, as TALEs can be designed de novo to target a given DNA sequence. Currently, however, the use of TALEs is limited by degeneracy in the site-specific manner by which they recognize DNA. Here, we propose an algorithm to computationally address this problem. We apply our algorithm to design 180 TALEs targeting 20 bp cognate binding sites that are at least 3 nt mismatches away from all 20 bp sequences in putative 2 kb human promoter regions. We generated eight of these synthetic TALE activators and showed that each is able to activate transcription from a targeted reporter. Importantly, we show that these proteins do not activate synthetic reporters containing mismatches similar to those present in the genome nor a set of endogenous genes predicted to be the most likely targets in vivo. Finally, we generated and characterized TALE repressors comprised of our orthogonal DNA binding domains and further combined them with shRNAs to accomplish near complete repression of target gene expression

    Gene expression drives the evolution of dominance.

    Get PDF
    Dominance is a fundamental concept in molecular genetics and has implications for understanding patterns of genetic variation, evolution, and complex traits. However, despite its importance, the degree of dominance in natural populations is poorly quantified. Here, we leverage multiple mating systems in natural populations of Arabidopsis to co-estimate the distribution of fitness effects and dominance coefficients of new amino acid changing mutations. We find that more deleterious mutations are more likely to be recessive than less deleterious mutations. Further, this pattern holds across gene categories, but varies with the connectivity and expression patterns of genes. Our work argues that dominance arises as a consequence of the functional importance of genes and their optimal expression levels

    The impact of population demography and selection on the genetic architecture of complex traits

    Full text link
    Population genetic studies have found evidence for dramatic population growth in recent human history. It is unclear how this recent population growth, combined with the effects of negative natural selection, has affected patterns of deleterious variation, as well as the number, frequencies, and effect sizes of mutations that contribute risk to complex traits. Here I use simulations under population genetic models where a proportion of the heritability of the trait is accounted for by mutations in a subset of the exome. I show that recent population growth increases the proportion of nonsynonymous variants segregating in the population, but does not affect the genetic load relative to that in a population that did not expand. Under a model where a mutation's effect on a trait is correlated with its effect on fitness, rare variants explain a greater portion of the additive genetic variance of the trait in a population that has recently expanded than in a population that did not recently expand. Further, when using a single-marker test, for a given false-positive rate and sample size, recent population growth decreases the expected number of significant association with the trait relative to the number detected in a population that did not expand. However, in a model where there is no correlation between a mutation's effect on fitness and the effect on the trait, common variants account for much of the additive genetic variance, regardless of demography. Moreover, here demography does not affect the number of significant association detected. These finding suggest recent population history may be an important factor influencing the power of association tests in accounting for the missing heritability of certain complex traits

    A novel approach to simulate gene-environment interactions in complex diseases

    Get PDF
    Background: Complex diseases are multifactorial traits caused by both genetic and environmental factors. They represent the major part of human diseases and include those with largest prevalence and mortality (cancer, heart disease, obesity, etc.). Despite a large amount of information that has been collected about both genetic and environmental risk factors, there are few examples of studies on their interactions in epidemiological literature. One reason can be the incomplete knowledge of the power of statistical methods designed to search for risk factors and their interactions in these data sets. An improvement in this direction would lead to a better understanding and description of gene-environment interactions. To this aim, a possible strategy is to challenge the different statistical methods against data sets where the underlying phenomenon is completely known and fully controllable, for example simulated ones. Results: We present a mathematical approach that models gene-environment interactions. By this method it is possible to generate simulated populations having gene-environment interactions of any form, involving any number of genetic and environmental factors and also allowing non-linear interactions as epistasis. In particular, we implemented a simple version of this model in a Gene-Environment iNteraction Simulator (GENS), a tool designed to simulate case-control data sets where a one gene-one environment interaction influences the disease risk. The main aim has been to allow the input of population characteristics by using standard epidemiological measures and to implement constraints to make the simulator behaviour biologically meaningful. Conclusions: By the multi-logistic model implemented in GENS it is possible to simulate case-control samples of complex disease where gene-environment interactions influence the disease risk. The user has full control of the main characteristics of the simulated population and a Monte Carlo process allows random variability. A knowledge-based approach reduces the complexity of the mathematical model by using reasonable biological constraints and makes the simulation more understandable in biological terms. Simulated data sets can be used for the assessment of novel statistical methods or for the evaluation of the statistical power when designing a study
    corecore