The role of mutation rate variation and genetic diversity in the architecture of human disease
Background
We have investigated the role that the mutation rate and the structure of genetic variation at a locus play in determining whether a gene is involved in disease. We predict that both the mutation rate and the genetic diversity should be higher in genes associated with disease, unless all genes that could cause disease have already been identified.
Results
Consistent with our predictions, we find that genes associated with Mendelian and complex disease are substantially longer than non-disease genes. However, we find that both Mendelian and complex disease genes are found in regions of the genome with relatively low mutation rates, as inferred from intron divergence between humans and chimpanzees, and they are predicted to have rates of non-synonymous mutation similar to those of other genes. Finally, we find that disease genes are in regions of significantly elevated genetic diversity, even when variation in the rate of mutation is controlled for. Nevertheless, the effect is small.
Conclusions
Our results suggest that gene length contributes to whether a gene is associated with disease. However, the mutation rate and the genetic architecture of the locus appear to play only a minor role in determining whether a gene is associated with disease.
Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans
It has long been suspected, based on the divergence between humans and other species, that the rate of mutation varies across the human genome at a large scale. However, it is now possible to investigate this question directly using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differs between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scales that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.
Protein interaction network of alternatively spliced isoforms from brain links genetic risk factors for autism
Increased risk for autism spectrum disorders (ASD) is attributed to hundreds of genetic loci. The convergence of ASD variants has been investigated using various approaches, including protein interactions extracted from the published literature. However, these datasets are frequently incomplete, carry biases and are limited to interactions of a single splicing isoform, which may not be expressed in the disease-relevant tissue. Here we introduce a new interactome mapping approach by experimentally identifying interactions between brain-expressed alternatively spliced variants of ASD risk factors. The Autism Spliceform Interaction Network reveals that almost half of the detected interactions and about 30% of the newly identified interacting partners represent contribution from splicing variants, emphasizing the importance of isoform networks. Isoform interactions greatly contribute to establishing direct physical connections between proteins from the de novo autism CNVs. Our findings demonstrate the critical role of spliceform networks for translating genetic knowledge into a better understanding of human diseases.
Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm
Background
Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because it is easier to genotype than to discover the variation. This is also true for structural variation detected from sequence data. However, the genotypes at known variable sites can only be inferred with uncertainty from low coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, the current implementations of these methods are intended to analyse only single nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This is generally false for structural variants detected with paired ends or split reads. Therefore, the population genetics of structural variants cannot be studied, unless a painstaking and potentially biased genotyping is performed first.
Results
We present svgem, an expectation-maximization implementation to estimate allele and genotype frequencies, calculate genotype posterior probabilities, and test for Hardy-Weinberg equilibrium and for population differences, from the numbers of times the alleles are observed in each individual. Although applicable to single nucleotide variation, it aims at bi-allelic structural variation of any type, observed by either split reads or paired ends, with arbitrarily high allele sampling bias. We test svgem with simulated and real data from the 1000 Genomes Project.
Conclusions
svgem makes it possible to use low-coverage sequencing data to study the population distribution of structural variants without having to know their genotypes. Furthermore, this advance allows the combined analysis of structural and nucleotide variation within the same genotype-free statistical framework, thus preventing biases introduced by genotype imputation.
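To make the idea of estimating an allele frequency from allele observation counts more concrete, the following is a minimal sketch of a generic expectation-maximization loop. It is not svgem's actual implementation: the function names, the `het_alt_prob` parameter (the assumed probability of sampling the alternative allele in a heterozygote, which models allele sampling bias), the `error` rate, and the use of Hardy-Weinberg proportions as the genotype prior are all assumptions made for illustration.

```python
from math import comb


def genotype_likelihoods(ref_count, alt_count, het_alt_prob=0.5, error=0.01):
    """Binomial likelihood of the observed counts for genotypes with 0, 1, 2 alt alleles.

    het_alt_prob and error are illustrative assumptions, with 0 < error < 1.
    """
    n = ref_count + alt_count
    # Probability of observing the alt allele under each genotype.
    alt_probs = (error, het_alt_prob, 1.0 - error)
    return [comb(n, alt_count) * q ** alt_count * (1.0 - q) ** ref_count
            for q in alt_probs]


def em_allele_frequency(counts, het_alt_prob=0.5, error=0.01,
                        iterations=100, tol=1e-9):
    """Estimate the alt allele frequency from (ref_count, alt_count) pairs."""
    liks = [genotype_likelihoods(r, a, het_alt_prob, error) for r, a in counts]
    p = 0.5  # starting value
    for _ in range(iterations):
        # Genotype prior under Hardy-Weinberg proportions (an assumption here).
        prior = ((1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2)
        expected_alt = 0.0
        for lik in liks:
            # E-step: unnormalized genotype posteriors for this individual.
            joint = [l * w for l, w in zip(lik, prior)]
            total = sum(joint)
            # Expected number of alt alleles carried by this individual.
            expected_alt += (joint[1] + 2.0 * joint[2]) / total
        # M-step: update the allele frequency from the expected allele counts.
        new_p = expected_alt / (2.0 * len(liks))
        converged = abs(new_p - p) < tol
        p = new_p
        if converged:
            break
    return p
```

For example, `em_allele_frequency([(10, 0), (5, 5), (0, 10)])` treats each pair as the reference and alternative allele observations of one individual. Setting `het_alt_prob` below 0.5 lets heterozygotes with few alternative-allele observations still be weighted as heterozygotes, which is the kind of sampling bias the abstract describes for split-read and paired-end evidence.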
Fitness Consequences of Advanced Ancestral Age over Three Generations in Humans
A rapid rise in age at parenthood in contemporary societies has increased interest in reports of higher prevalence of de novo mutations and health problems in individuals with older fathers, but the fitness consequences of such age effects over several generations remain untested. Here, we use extensive pedigree data on seven pre-industrial Finnish populations to show how the ages of ancestors for up to three generations are associated with fitness traits. Individuals whose fathers, grandfathers and great-grandfathers fathered their lineage on average under age 30 were ~13% more likely to survive to adulthood than those whose ancestors fathered their lineage at over 40 years. In addition, females had a lower probability of marriage if their male ancestors were older. These findings are consistent with an increase in the number of accumulated de novo mutations with male age, suggesting that deleterious mutations acquired from recent ancestors may be a substantial burden to fitness in humans. However, possible non-mutational explanations for the observed associations are also discussed.
Mouse HORMAD1 and HORMAD2, two conserved meiotic chromosomal proteins, are depleted from synapsed chromosome axes with the help of TRIP13 AAA-ATPase
Meiotic crossovers are produced when programmed double-strand breaks (DSBs) are repaired by recombination from homologous chromosomes (homologues). In a wide variety of organisms, meiotic HORMA-domain proteins are required to direct DSB repair towards homologues. This inter-homologue bias is required for efficient homology search, homologue alignment, and crossover formation. HORMA-domain proteins are also implicated in other processes related to crossover formation, including DSB formation, inhibition of promiscuous formation of the synaptonemal complex (SC), and the meiotic prophase checkpoint that monitors both DSB processing and SCs. We examined the behavior of two previously uncharacterized meiosis-specific mouse HORMA-domain proteins, HORMAD1 and HORMAD2, in wild-type mice and in mutants defective in DSB processing or SC formation. HORMADs are preferentially associated with unsynapsed chromosome axes throughout meiotic prophase. We observe a strong negative correlation between SC formation and presence of HORMADs on axes, and a positive correlation between the presumptive sites of high checkpoint-kinase ATR activity and hyper-accumulation of HORMADs on axes. HORMADs are not depleted from chromosomes in mutants that lack SCs. In contrast, DSB formation and DSB repair are not absolutely required for depletion of HORMADs from synapsed axes. A simple interpretation of these findings is that SC formation directly or indirectly promotes depletion of HORMADs from chromosome axes. We also find that TRIP13 protein is required for reciprocal distribution of HORMADs and the SYCP1/SC-component along chromosome axes. Similarities in mouse and budding yeast meiosis suggest that TRIP13/Pch2 proteins have a conserved role in establishing mutually exclusive HORMAD-rich and synapsed chromatin domains in both mouse and yeast. Taken together, our observations raise the possibility that involvement of meiotic HORMA-domain proteins in the regulation of homologue interactions is conserved in mammals.
A Cell Motility Screen Reveals Role for MARCKS-Related Protein in Adherens Junction Formation and Tumorigenesis
Invasion through the extracellular matrix (ECM) is important for wound healing, immunological responses and metastasis. We established an invasion-based cell motility screen using Boyden chambers overlaid with Matrigel to select for pro-invasive genes. By this method we identified antisense to MARCKS-related protein (MRP), whose family member MARCKS is a target of miR-21, a microRNA involved in tumor growth, invasion and metastasis in multiple human cancers. We confirmed that targeted knockdown of MRP, in both EpRas mammary epithelial cells and PC3 prostate cancer cells, promoted in vitro cell migration that was blocked by trifluoperazine. Additionally, we observed increased immunofluorescence of E-cadherin, β-catenin and APC at sites of cell-cell contact in EpRas cells with MRP knockdown, suggesting formation of adherens junctions. By wound healing assay we observed that reduced MRP supported collective cell migration, a type of cell movement where adherens junctions are maintained. However, destabilized adherens junctions, like those seen in EpRas cells, are frequently important for oncogenic signaling. Consequently, knockdown of MRP in EpRas caused loss of tumorigenesis in vivo, and reduced Wnt3a-induced TCF reporter signaling in vitro. Together our data suggest that reducing MRP expression promotes formation of adherens junctions in EpRas cells, allowing collective cell migration, but interferes with oncogenic β-catenin signaling and tumorigenesis.
Integrative Analysis of Low- and High-Resolution eQTL
The study of expression quantitative trait loci (eQTL) is a powerful way of detecting transcriptional regulators at a genomic scale and of elucidating how natural genetic variation impacts gene expression. Power and genetic resolution are heavily affected by the study population: whereas recombinant inbred (RI) strains yield greater statistical power with low genetic resolution, using diverse inbred or outbred strains improves genetic resolution at the cost of lower power. In order to overcome the limitations of both individual approaches, we combine data from RI strains with genetically more diverse strains and analyze hippocampus eQTL data obtained from mouse RI strains (BXD) and from a panel of diverse inbred strains (Mouse Diversity Panel, MDP). We perform a systematic analysis of the consistency of eQTL independently obtained from these two populations and demonstrate that a significant fraction of eQTL can be replicated. Based on existing knowledge from pathway databases we assess different approaches for using the high-resolution MDP data for fine mapping BXD eQTL. Finally, we apply this framework to an eQTL hotspot on chromosome 1 (Qrr1), which has been implicated in a range of neurological traits. Here we present the first systematic examination of the consistency between eQTL obtained independently from the BXD and MDP populations. Our analysis of fine-mapping approaches is based on ‘real life’ data as opposed to simulated data and it allows us to propose a strategy for using MDP data to fine map BXD eQTL. Application of this framework to Qrr1 reveals that this eQTL hotspot is not caused by just one (or a few) ‘master regulators’, but actually by a set of polymorphic genes specific to the central nervous system.
Phosphatase of Regenerating Liver-3 Localizes to Cyto-Membrane and Is Required for B16F1 Melanoma Cell Metastasis In Vitro and In Vivo
BACKGROUND: Phosphatase of regenerating liver-3 (PRL-3) is a member of the novel phosphatases of regenerating liver family, characterized by one protein tyrosine phosphatase active domain and a C-terminal prenylation (CCVM) motif. Though widely proposed to facilitate metastasis in many cancer types, PRL-3's cellular localization and the function of its CCVM motif in the metastatic process remain unknown. METHODOLOGY/PRINCIPAL FINDINGS: In the present study, a series of Myc-tagged PRL-3 wild-type or mutant plasmids were expressed in B16F1 melanoma cells to investigate the relationship between PRL-3's cellular localization and metastasis. With immunofluorescence microscopy and cell adhesion/migration assays in vitro, and an experimental passive metastasis model in vivo, we found that the CCVM motif is critical for the localization of PRL-3 on the cell plasma membrane and for the lung metastasis of melanoma. In particular, Cysteine170 is the key site for prenylation in this process. CONCLUSIONS/SIGNIFICANCE: These results suggest that cellular localization of PRL-3 is highly correlated with its function in tumor metastasis, and inhibition of PRL-3 prenylation might be a new approach to cancer therapy.
A framework for the detection of de novo mutations in family-based sequencing data
Francioli LC, Cretu-Stancu M, Garimella KV, et al. A framework for the detection of de novo mutations in family-based sequencing data. European Journal of Human Genetics. 2016;25(2):227-233
- …
