Ancestral population genomics
The full genomes of several closely related species are now available, opening an emerging field of investigation that borrows from both population genetics and phylogenetics. Provided that we can properly model sequence evolution within populations undergoing speciation events, this resource enables us to estimate key population genetics parameters, such as ancestral population sizes and split times. Furthermore, we can enhance our understanding of the recombination process and investigate various selective forces. We discuss the basic speciation models for closely related species, including the isolation and isolation-with-migration models. A major point in our discussion is that just a few complete genomes already contain much information about the whole population. The reason is that recombination unlinks genomic regions, and therefore a few genomes contain many segments with distinct histories. The challenge of population genomics is to decode this mosaic of histories in order to infer scenarios of demography and selection. We survey different approaches for understanding ancestral species from analyses of genomic data from closely related species. In particular, we emphasize core assumptions and working hypotheses. Finally, we discuss computational and statistical challenges that arise in the analysis of population genomics data sets.
Simulation from endpoint-conditioned, continuous-time Markov chains on a finite state space, with applications to molecular evolution
Analyses of serially-sampled data often begin with the assumption that the
observations represent discrete samples from a latent continuous-time
stochastic process. The continuous-time Markov chain (CTMC) is one such
generative model whose popularity extends to a variety of disciplines ranging
from computational finance to human genetics and genomics. A common theme among
these diverse applications is the need to simulate sample paths of a CTMC
conditional on realized data that is discretely observed. Here we present a
general solution to this sampling problem when the CTMC is defined on a
discrete and finite state space. Specifically, we consider the generation of
sample paths, including intermediate states and times of transition, from a
CTMC whose beginning and ending states are known across a time interval of
length T. We first unify the literature through a discussion of the three
predominant approaches: (1) modified rejection sampling, (2) direct sampling,
and (3) uniformization. We then give analytical results for the complexity and
efficiency of each method in terms of the instantaneous transition rate matrix
Q of the CTMC, its beginning and ending states, and the length of sampling
time T. In doing so, we show that no method dominates the others across all
model specifications, and we give explicit proof of which method prevails for
any given Q, T, and endpoints. Finally, we introduce and compare three
applications of CTMCs to demonstrate the pitfalls of choosing an inefficient
sampler.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS247 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
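To make the sampling problem in the abstract above concrete, here is a minimal sketch of the simplest strategy: plain (unmodified) rejection sampling. It simulates the chain forward from the starting state and keeps the path only if it ends in the required state. This is not the paper's modified rejection sampler, direct sampler, or uniformization scheme. The function names, the list-of-lists rate matrix (called Q here), the interval length (called T here), and the `max_tries` cap are all assumptions made for illustration.

```python
import random


def simulate_forward(Q, start, T, rng):
    """Simulate one CTMC path on [0, T] starting in `start`.

    Q is a rate matrix given as a list of rows: off-diagonal entries are
    non-negative rates and each diagonal entry is minus the row's total rate.
    Returns a list of (jump_time, state) pairs, beginning with (0.0, start).
    """
    path = [(0.0, start)]
    t, state = 0.0, start
    while True:
        rate = -Q[state][state]
        if rate <= 0.0:  # absorbing state: no further jumps are possible
            return path
        t += rng.expovariate(rate)  # exponential waiting time in `state`
        if t >= T:  # next jump would fall outside the interval
            return path
        # Pick the next state with probability proportional to its rate.
        targets = [j for j in range(len(Q)) if j != state and Q[state][j] > 0.0]
        weights = [Q[state][j] for j in targets]
        state = rng.choices(targets, weights=weights)[0]
        path.append((t, state))


def sample_bridge_rejection(Q, start, end, T, rng=None, max_tries=100_000):
    """Draw a path from `start` to `end` over [0, T] by naive rejection."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        path = simulate_forward(Q, start, T, rng)
        if path[-1][1] == end:  # accept only paths that hit the endpoint
            return path
    raise RuntimeError("no path accepted; the endpoint may be very unlikely")


if __name__ == "__main__":
    Q = [[-1.0, 1.0], [2.0, -2.0]]  # hypothetical two-state rate matrix
    print(sample_bridge_rejection(Q, start=0, end=1, T=1.5, rng=random.Random(0)))
```

The acceptance rate of this sketch equals the probability of reaching the end state at the end of the interval, so it becomes wasteful when that probability is small, which is the kind of inefficiency that motivates comparing samplers.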
Blood ties: ABO is a trans-species polymorphism in primates
The ABO histo-blood group, the critical determinant of transfusion
incompatibility, was the first genetic polymorphism discovered in humans.
Remarkably, ABO antigens are also polymorphic in many other primates, with the
same two amino acid changes responsible for A and B specificity in all species
sequenced to date. Whether this recurrence of A and B antigens is the result of
an ancient polymorphism maintained across species or due to numerous, more
recent instances of convergent evolution has been debated for decades, with a
current consensus in support of convergent evolution. We show instead that
genetic variation data in humans and gibbons as well as in Old World Monkeys
are inconsistent with a model of convergent evolution and support the
hypothesis of an ancient, multi-allelic polymorphism of which some alleles are
shared by descent among species. These results demonstrate that the ABO
polymorphism is a trans-species polymorphism among distantly related species
and has remained under balancing selection for tens of millions of years. It
is, to date, the only such example in Hominoids and Old World Monkeys outside
of the Major Histocompatibility Complex.
Comment: 45 pages, 4 Figures, 4 Supplementary Figures, 5 Supplementary Tables
Combined burden and functional impact tests for cancer driver discovery using DriverPower
The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower's background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
Skewed Exposure to Environmental Antigens Complements Hygiene Hypothesis in Explaining the Rise of Allergy
The Hygiene Hypothesis has been recognized as an important cornerstone for explaining the sudden increase in the prevalence of asthma and allergic diseases in modernized culture. The recent epidemic of allergic diseases contrasts with the gradual adaptation of Homo sapiens sapiens to the present-day forms of civilization. The development of this civilization was a gradual process with cumulative effects on the human immune system, which co-developed with parasitic and commensal helminths. The clinical manifestation of this epidemic, however, became visible only in the second half of the twentieth century. In explaining these clinical effects in terms of the underlying IgE-mediated reactions to innocuous environmental antigens, the low biodiversity of antigens in the domestic environment plays a pivotal role. The skewing of antigen exposure, as a cumulative effect of reduced biodiversity in the immediate human environment as well as of changing food habits, provides a sufficient and parsimonious explanation for the rise in allergic diseases in a highly developed and helminth-free modernized culture. Socio-economic tendencies that incline towards a further reduction of environmental biodiversity may be a serious concern for future health. This article explains that the “Hygiene Hypothesis”, the “Old Friends Hypothesis”, and the “Skewed Antigen Exposure Hypothesis” are all required to more fully explain the rise of allergy in modern societies.
Different paths to the modern state in Europe: the interaction between domestic political economy and interstate competition
Theoretical work on state formation and capacity has focused mostly on early modern Europe and on the experience of western European states during this period. While a number of European states monopolized domestic tax collection and achieved gains in state capacity during the early modern era, for others revenues stagnated or even declined, and these variations motivated alternative hypotheses for the determinants of fiscal and state capacity. In this study we test the basic hypotheses in the existing literature, making use of the large data set we have compiled for all of the leading states across the continent. We find strong empirical support for two prevailing threads in the literature, arguing respectively that interstate wars and changes in economic structure towards an urbanized economy had a positive fiscal impact. Regarding the main point of contention in the theoretical literature, whether it was representative or authoritarian political regimes that facilitated the gains in fiscal capacity, we do not find conclusive evidence that one performed better than the other. Instead, the empirical evidence we have gathered lends support to the hypothesis that, when under pressure of war, the fiscal performance of representative regimes was better in the more urbanized-commercial economies and the fiscal performance of authoritarian regimes was better in rural-agrarian economies.
Importance of incomplete lineage sorting and introgression in the origin of shared genetic variation between two closely related pines with overlapping distributions
Genetic variation shared between closely related species may be due to retention of ancestral polymorphisms because of incomplete lineage sorting (ILS) and/or introgression following secondary contact. It is challenging to distinguish ILS and introgression because they generate similar patterns of shared genetic diversity, but this is nonetheless essential for inferring accurately the history of species with overlapping distributions. To address this issue, we sequenced 33 independent intron loci across the genome of two closely related pine species (Pinus massoniana Lamb. and Pinus hwangshanensis Hisa) from Southeast China. Population structure analyses revealed that the species showed slightly more admixture in parapatric populations than in allopatric populations. Levels of interspecific differentiation were lower in parapatry than in allopatry. Approximate Bayesian computation suggested that the most likely speciation scenario explaining this pattern was a long period of isolation followed by a secondary contact. Ecological niche modeling suggested that a gradual range expansion of P. hwangshanensis during the Pleistocene climatic oscillations could have been the cause of the overlap. Our study therefore suggests that secondary introgression, rather than ILS, explains most of the shared nuclear genomic variation between these two species and demonstrates the complementarity of population genetics and ecological niche modeling in understanding gene flow history. Finally, we discuss the importance of contrasting results from markers with different dynamics of migration, namely nuclear, chloroplast and mitochondrial DNA
Extreme selective sweeps independently targeted the X chromosomes of the great apes
The unique inheritance pattern of the X chromosome exposes it to natural selection in a way that is different from that of the autosomes, potentially resulting in accelerated evolution. We perform a comparative analysis of X chromosome polymorphism in 10 great ape species, including humans. In most species, we identify striking megabase-wide regions, where nucleotide diversity is less than 20% of the chromosomal average. Such regions are found exclusively on the X chromosome. The regions overlap partially among species, suggesting that the underlying targets are partly shared among species. The regions have higher proportions of singleton SNPs, higher levels of population differentiation, and a higher nonsynonymous-to-synonymous substitution ratio than the rest of the X chromosome. We show that the extent to which diversity is reduced is incompatible with direct selection or the action of background selection and soft selective sweeps alone, and therefore, we suggest that very strong selective sweeps have independently targeted these specific regions in several species. The only genomic feature that we can identify as strongly associated with loss of diversity is the location of testis-expressed ampliconic genes, which also have reduced diversity around them. We hypothesize that these genes may be responsible for selective sweeps in the form of meiotic drive caused by an intragenomic conflict in male meiosis
Ancestral population genomics
Borrowing from both population genetics and phylogenetics, the field of population genomics emerged as the full genomes of several closely related species became available. Provided that we can properly model sequence evolution within populations undergoing speciation events, this resource enables us to estimate key population genetics parameters such as ancestral population sizes and split times. Furthermore, we can enhance our understanding of the recombination process and investigate various selective forces. With the advent of resequencing technologies, genome-wide patterns of diversity in extant populations have now come to complement this picture, offering increasing power to study more recent genetic history.
