Selective Constraint on Noncoding Regions of Hominid Genomes
An important challenge for human evolutionary biology is to understand the genetic basis of human–chimpanzee differences. One influential idea holds that such differences depend, to a large extent, on adaptive changes in gene expression. An important step in assessing this hypothesis involves gaining a better understanding of selective constraint on noncoding regions of hominid genomes. In noncoding sequence, functional elements are frequently small and can be separated by large nonfunctional regions. For this reason, constraint in hominid genomes is likely to be patchy. Here we use conservation in more distantly related mammals and amniotes as a way of identifying small sequence windows that are likely to be functional. We find that putatively functional noncoding elements defined in this manner are subject to significant selective constraint in hominids
A Macaque's-Eye View of Human Insertions and Deletions: Differences in Mechanisms
Insertions and deletions (indels) cause numerous genetic diseases and lead to pronounced evolutionary differences among genomes. The macaque sequences provide an opportunity to gain insights into the mechanisms generating these mutations on a genome-wide scale by establishing the polarity of indels occurring in the human lineage since its divergence from the chimpanzee. Here we apply novel regression techniques and multiscale analyses to demonstrate an extensive regional indel rate variation stemming from local fluctuations in divergence, GC content, male and female recombination rates, proximity to telomeres, and other genomic factors. We find that both replication and, surprisingly, recombination are significantly associated with the occurrence of small indels. Intriguingly, the relative inputs of replication versus recombination differ between insertions and deletions, thus the two types of mutations are likely guided in part by distinct mechanisms. Namely, insertions are more strongly associated with factors linked to recombination, while deletions are mostly associated with replication-related features. Indel as a term misleadingly groups the two types of mutations together by their effect on a sequence alignment. However, here we establish that the correct identification of a small gap as an insertion or a deletion (by use of an outgroup) is crucial to determining its mechanism of origin. In addition to providing novel insights into insertion and deletion mutagenesis, these results will assist in gap penalty modeling and eventually lead to more reliable genomic alignments
Heterotachy in Mammalian Promoter Evolution
We have surveyed the evolutionary trends of mammalian promoters and upstream sequences, utilising large sets of experimentally supported transcription start sites (TSSs). With 30,969 well-defined TSSs from mouse and 26,341 from human, there are sufficient numbers to draw statistically meaningful conclusions and to consider differences between promoter types. Unlike previous smaller studies, we have considered the effects of insertions, deletions, and transposable elements as well as nucleotide substitutions. The rate of promoter evolution relative to that of control sequences has not been consistent between lineages nor within lineages over time. The most pronounced manifestation of this heterotachy is the increased rate of evolution in primate promoters. This increase is seen across different classes of mutation, including substitutions and micro-indel events. We investigated the relationship between promoter and coding sequence selective constraint and suggest that they are generally uncorrelated. This analysis also identified a small number of mouse promoters associated with the immune response that are under positive selection in rodents. We demonstrate significant differences in divergence between functional promoter categories and identify a category of promoters, not associated with conventional protein-coding genes, that has the highest rates of divergence across mammals. We find that evolutionary rates vary both on a fine scale within mammalian promoters and also between different functional classes of promoters. The discovery of heterotachy in promoter evolution, in particular the accelerated evolution of primate promoters, has important implications for our understanding of human evolution and for strategies to detect primate-specific regulatory elements
Comparing Patterns of Natural Selection across Species Using Selective Signatures
Comparing gene expression profiles over many different conditions has led to insights that were not obvious from single experiments. In the same way, comparing patterns of natural selection across a set of ecologically distinct species may extend what can be learned from individual genome-wide surveys. Toward this end, we show how variation in protein evolutionary rates, after correcting for genome-wide effects such as mutation rate and demographic factors, can be used to estimate the level and types of natural selection acting on genes across different species. We identify unusually rapidly and slowly evolving genes, relative to empirically derived genome-wide and gene family-specific background rates for 744 core protein families in 30 γ-proteobacterial species. We describe the pattern of fast or slow evolution across species as the “selective signature” of a gene. Selective signatures represent a profile of selection across species that is predictive of gene function: pairs of genes with correlated selective signatures are more likely to share the same cellular function, and genes in the same pathway can evolve in concert. For example, glycolysis and phenylalanine metabolism genes evolve rapidly in Idiomarina loihiensis, mirroring an ecological shift in carbon source from sugars to amino acids. In a broader context, our results suggest that the genomic landscape is organized into functional modules even at the level of natural selection, and thus it may be easier than expected to understand the complex evolutionary pressures on a cell
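As a rough illustration of the idea in this abstract, the sketch below treats a selective signature as what remains of a gene's evolutionary rate after removing a gene-family effect and a species-wide effect, then correlates signatures between genes. The additive log-rate model, the function names, and the toy numbers are all assumptions made for illustration; the abstract does not specify the authors' actual normalization.

```python
import numpy as np

def selective_signatures(rates):
    """Return residual log-rates for a genes x species matrix of positive rates.

    Assumed model (not taken from the abstract):
    log(rate) = gene-family effect + species effect + residual.
    """
    log_r = np.log(np.asarray(rates, dtype=float))
    row_mean = log_r.mean(axis=1, keepdims=True)   # gene-family background
    col_mean = log_r.mean(axis=0, keepdims=True)   # species-wide background
    return log_r - row_mean - col_mean + log_r.mean()

def signature_correlation(signatures):
    """Pearson correlation between the signatures of every pair of genes."""
    return np.corrcoef(signatures)

# Toy data: 3 gene families x 4 species (hypothetical values).
background = np.outer([1.0, 2.0, 4.0], [1.0, 3.0, 9.0, 27.0])
deviation = np.array([
    [2.0, 1.0, 1.0, 0.5],   # fast in species 0, slow in species 3
    [2.0, 1.0, 1.0, 0.5],   # same pattern as the first gene
    [0.5, 1.0, 1.0, 2.0],   # opposite pattern
])
signatures = selective_signatures(background * deviation)
correlations = signature_correlation(signatures)
```

In this toy case the first two genes share a pattern of fast and slow evolution and so have positively correlated signatures, while the third is anticorrelated with both.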
Alu Recombination-Mediated Structural Deletions in the Chimpanzee Genome
With more than 1.2 million copies, Alu elements are one of the most important sources of structural variation in primate genomes. Here, we compare the chimpanzee and human genomes to determine the extent of Alu recombination-mediated deletion (ARMD) in the chimpanzee genome since the divergence of the chimpanzee and human lineages (∼6 million y ago). Combining computational data analysis and experimental verification, we have identified 663 chimpanzee lineage-specific deletions (involving a total of ∼771 kb of genomic sequence) attributable to this process. The ARMD events essentially counteract the genomic expansion caused by chimpanzee-specific Alu inserts. The RefSeq databases indicate that 13 exons in six genes, annotated as either demonstrably or putatively functional in the human genome, and 299 intronic regions have been deleted through ARMDs in the chimpanzee lineage. Therefore, our data suggest that this process may contribute to the genomic and phenotypic diversity between chimpanzees and humans. In addition, we found four independent ARMD events at orthologous loci in the gorilla or orangutan genomes. This suggests that human orthologs of loci at which ARMD events have already occurred in other nonhuman primate genomes may be “at-risk” motifs for future deletions, which may subsequently contribute to human lineage-specific genetic rearrangements and disorders
A Map of Recent Positive Selection in the Human Genome
The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest ∼250 signals of recent selection in each population
Allele Frequency Matching Between SNPs Reveals an Excess of Linkage Disequilibrium in Genic Regions of the Human Genome
Significant interest has emerged in mapping genetic susceptibility for complex traits through whole-genome association studies. These studies rely on the extent of association, i.e., linkage disequilibrium (LD), between single nucleotide polymorphisms (SNPs) across the human genome. LD describes the nonrandom association between SNP pairs and can be used as a metric when designing maximally informative panels of SNPs for association studies in human populations. Using data from the 1.58 million SNPs genotyped by Perlegen, we explored the allele frequency dependence of the LD statistic r² both empirically and theoretically. We show that average r² values between SNPs unmatched for allele frequency are always limited to much less than 1 (theoretical average r² approximately 0.46 to 0.57 for this dataset). Frequency matching of SNP pairs provides a more sensitive measure for assessing the average decay of LD and generates average r² values across nearly the entire informative range (from 0 to 0.89 through 0.95). Additionally, we analyzed the extent of perfect LD (r² = 1.0) using frequency-matched SNPs and found significant differences in the extent of LD in genic regions versus intergenic regions. The SNP pairs exhibiting perfect LD showed a significant bias for derived, nonancestral alleles, providing evidence for positive natural selection in the human genome
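The allele frequency dependence described in this abstract can be sketched with the standard two-locus definition of the LD statistic, r² = D² / (p_A(1 − p_A) p_B(1 − p_B)), where D = p_AB − p_A p_B. The function names below are hypothetical, and the bound is the textbook one implied by haplotype frequencies having to stay non-negative; it is not the authors' dataset-level calculation.

```python
def r_squared(p_ab, p_a, p_b):
    """r^2 from the AB haplotype frequency and the two allele frequencies."""
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def max_r_squared(p_a, p_b):
    """Largest r^2 attainable for two biallelic SNPs with these allele frequencies.

    The AB haplotype frequency must lie between max(0, p_a + p_b - 1)
    and min(p_a, p_b), which bounds D in both directions.
    """
    d_pos = min(p_a, p_b) - p_a * p_b                 # strongest positive association
    d_neg = p_a * p_b - max(0.0, p_a + p_b - 1.0)     # strongest negative association
    d = max(d_pos, d_neg)
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

matched = max_r_squared(0.3, 0.3)      # equal allele frequencies
unmatched = max_r_squared(0.1, 0.5)    # mismatched allele frequencies
```

With matched frequencies the bound is 1, so perfect LD is possible; with frequencies of 0.1 and 0.5 the bound is 1/9, which is why averages over unmatched pairs stay well below 1.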
The Molecular Anatomy of Spontaneous Germline Mutations in Human Testes
The frequency of the most common sporadic Apert syndrome mutation (C755G) in the human fibroblast growth factor receptor 2 gene (FGFR2) is 100–1,000 times higher than expected from average nucleotide substitution rates based on evolutionary studies and the incidence of human genetic diseases. To determine if this increased frequency was due to the nucleotide site having the properties of a mutation hot spot, or some other explanation, we developed a new experimental approach. We examined the spatial distribution of the frequency of the C755G mutation in the germline by dividing four testes from two normal individuals each into several hundred pieces, and, using a highly sensitive PCR assay, we measured the mutation frequency of each piece. We discovered that each testis was characterized by rare foci with mutation frequencies 10³ to >10⁴ times higher than the rest of the testis regions. A model based on what is known about human germline development forced us to reject (p < 10⁻⁶) the idea that the C755G mutation arises more frequently because this nucleotide simply has a higher than average mutation rate (hot spot model). This is true regardless of whether mutation is dependent or independent of cell division. An alternate model was examined where positive selection acts on adult self-renewing Ap spermatogonial cells (SrAp) carrying this mutation such that, instead of only replacing themselves, they occasionally produce two SrAp cells. This model could not be rejected given our observed data. Unlike the disease site, similar analysis of C-to-G mutations at a control nucleotide site in one testis pair failed to find any foci with high mutation frequencies.
The rejection of the hot spot model and lack of rejection of a selection model for the C755G mutation, along with other data, provides strong support for the proposal that positive selection in the testis can act to increase the frequency of premeiotic germ cells carrying a mutation deleterious to an offspring, thereby unfavorably altering the mutational load in humans. Studying the anatomical distribution of germline mutations can provide new insights into genetic disease and evolutionary change
On the Origin and Evolution of Vertebrate Olfactory Receptor Genes: Comparative Genome Analysis Among 23 Chordate Species
Olfaction is a primitive sense in organisms. Both vertebrates and insects have receptors for detecting odor molecules in the environment, but the evolutionary origins of these genes are different. Among studied vertebrates, mammals have ∼1,000 olfactory receptor (OR) genes, whereas teleost fishes have much smaller (∼100) numbers of OR genes. To investigate the origin and evolution of vertebrate OR genes, I attempted to determine near-complete OR gene repertoires by searching whole-genome sequences of 14 nonmammalian chordates, including cephalochordates (amphioxus), urochordates (ascidian and larvacean), and vertebrates (sea lamprey, elephant shark, five teleost fishes, frog, lizard, and chicken), followed by a large-scale phylogenetic analysis in conjunction with mammalian OR genes identified from nine species. This analysis showed that the amphioxus has >30 vertebrate-type OR genes though it lacks distinctive olfactory organs, whereas all OR genes appear to have been lost in the urochordate lineage. Some groups of genes (θ, κ, and λ) that are phylogenetically nested within vertebrate OR genes showed few gene gains and losses, which is in sharp contrast to the evolutionary pattern of OR genes, suggesting that they are actually non-OR genes. Moreover, the analysis demonstrated a great difference in OR gene repertoires between aquatic and terrestrial vertebrates, reflecting the necessity for the detection of water-soluble and airborne odorants, respectively. However, a minor group (β) of genes that are atypically present in both aquatic and terrestrial vertebrates was also found. These findings should provide a critical foundation for further physiological, behavioral, and evolutionary studies of olfaction in various organisms
Forces Shaping the Fastest Evolving Regions in the Human Genome
Comparative genomics allows us to search the human genome for segments that were extensively changed in the last ~5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated elements. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome
