602 research outputs found
Cancer somatic mutations cluster in a subset of regulatory sites predicted from the ENCODE data
Background: Transcriptional regulation of gene expression is essential for cellular differentiation and function, and defects in the process are associated with cancer. The ENCODE project has mapped potential regulatory sites across the complete genome in many cell types, and these regions have been shown to harbour many of the somatic mutations that occur in cancer cells, suggesting that their effects may drive cancer initiation and development. The ENCODE data suggests a very large number of regulatory sites, and methods are needed to identify those that are most relevant and to connect them to the genes that they control. Methods: Predictive models of gene expression were developed by integrating the ENCODE data for regulation, including transcription factor binding and DNase1 hypersensitivity, with RNA-seq data for gene expression. A penalized regression method was used to identify the most predictive potential regulatory sites for each transcript. Known cancer somatic mutations from the COSMIC database were mapped to potential regulatory sites, and we examined differences in the mapping frequencies associated with sites chosen in regulatory models and other (rejected) sites. The effects of potential confounders, for example replication timing, were considered. Results: Cancer somatic mutations preferentially occupy those regulatory regions chosen in our models as most predictive of gene expression. Conclusion: Our methods have identified a significantly reduced set of regulatory sites that are enriched in cancer somatic mutations and are more predictive of gene expression. This has significance for the mechanistic interpretation of cancer mutations, and the understanding of genetic regulation
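The abstract above does not say which penalized regression was used. As a minimal sketch, assuming an L1 (lasso) penalty, the following shows how such a penalty can shrink uninformative candidate sites to exactly zero, so that the surviving non-zero coefficients play the role of "chosen" sites. The function names, the toy signal matrix and the penalty value are all hypothetical and are not taken from the paper.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimise (1/2n) * ||y - X b||^2 + penalty * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (one row per sample, one column per candidate site),
    y is a list of expression values. Illustrative only, not optimised.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            coef[j] = soft_threshold(rho / n, penalty) / (z / n) if z > 0 else 0.0
    return coef


# Hypothetical toy data: column 0 drives expression, column 1 is unrelated.
site_signals = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
expression = [2, -2, 2, -2]

coefficients = lasso_coordinate_descent(site_signals, expression, penalty=0.5)
selected_sites = [j for j, b in enumerate(coefficients) if b != 0.0]
print(coefficients, selected_sites)
```

In this toy case only column 0 is retained. A real analysis would standardise the features and choose the penalty by cross-validation, neither of which is shown here.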
Deriving a mutation index of carcinogenicity using protein structure and protein interfaces
With the advent of Next Generation Sequencing the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, for both assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/
The driver landscape of sporadic chordoma
Chordoma is a malignant, often incurable bone tumour showing notochordal differentiation. Here, we defined the somatic driver landscape of 104 cases of sporadic chordoma. We reveal somatic duplications of the notochordal transcription factor brachyury (T) in up to 27% of cases. These variants recapitulate the rearrangement architecture of the pathogenic germline duplications of T that underlie familial chordoma. In addition, we find potentially clinically actionable PI3K signalling mutations in 16% of cases. Intriguingly, one of the most frequently altered genes, mutated exclusively by inactivating mutation, was LYST (10%), which may represent a novel cancer gene in chordoma
STK295900, a Dual Inhibitor of Topoisomerase 1 and 2, Induces G2 Arrest in the Absence of DNA Damage
STK295900, a small synthetic molecule belonging to a class of symmetric bibenzimidazoles, exhibits antiproliferative activity against various human cancer cell lines of different origins. Examination of its effect in HeLa cells indicates that it induces G2 phase arrest without invoking DNA damage. Further analysis shows that STK295900 inhibits DNA relaxation mediated by topoisomerase 1 (Top 1) and topoisomerase 2 (Top 2) in vitro. In addition, STK295900 exhibits a protective effect against DNA damage induced by camptothecin. However, STK295900 does not affect etoposide-induced DNA damage. Moreover, STK295900 preferentially exerts a cytotoxic effect on cancer cell lines, whereas camptothecin, etoposide, and Hoechst 33342 affect both cancer and normal cells. Therefore, STK295900 has the potential to be developed as an anticancer chemotherapeutic agent. © 2013 Kim et al
Exome sequencing of pleuropulmonary blastoma reveals frequent biallelic loss of TP53 and two hits in DICER1 resulting in retention of 5p-derived miRNA hairpin loop sequences
Pleuropulmonary blastoma is a rare childhood malignancy of lung mesenchymal cells that can remain dormant as epithelial cysts or progress to high-grade sarcoma. Predisposing germline loss-of-function DICER1 variants have been described. We sought to uncover additional contributors through whole exome sequencing of 15 tumor/normal pairs, followed by targeted resequencing, miRNA analysis and immunohistochemical analysis of additional tumors. In addition to frequent biallelic loss of TP53 and mutations of NRAS or BRAF in some cases, each case had compound disruption of DICER1: a germline (12 cases) or somatic (3 cases) loss-of-function variant plus a somatic missense mutation in the RNase IIIb domain. 5p-Derived microRNA (miRNA) transcripts retained abnormal precursor miRNA loop sequences normally removed by DICER1. This work both defines a genetic interaction landscape with DICER1 mutation and provides evidence for alteration in miRNA transcripts as a consequence of DICER1 disruption in cancer
Levels of DNA methylation vary at CpG sites across the BRCA1 promoter, and differ according to triple negative and "BRCA-like" status, in both blood and tumour DNA
Triple negative breast cancer is typically an aggressive and difficult-to-treat subtype. It is often associated with loss of function of the BRCA1 gene, either through mutation, loss of heterozygosity or methylation. This study aimed to measure methylation of the BRCA1 gene promoter at individual CpG sites in blood, tumour and normal breast tissue, to assess whether levels were correlated between different tissues, and with triple negative receptor status, histopathological scoring for BRCA-like features and BRCA1 protein expression. Blood DNA methylation levels were significantly correlated with tumour methylation at 9 of 11 CpG sites examined (p<0.0007). The levels of tumour DNA methylation were significantly higher in triple negative tumours, and in tumours with high BRCA-like histopathological scores (10 of 11 CpG sites; p<0.01 and p<0.007 respectively). Similar results were observed in blood DNA (6 of 11 CpG sites; p<0.03 and 7 of 11 CpG sites; p<0.02 respectively). This study provides insight into the pattern of CpG methylation across the BRCA1 promoter, and supports previous studies suggesting that tumours with BRCA1 promoter methylation have similar features to those with BRCA1 mutations, and therefore may be suitable for the same targeted therapies
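The abstract above reports per-site correlations between blood and tumour methylation but does not name the statistic used. As an illustrative sketch only, assuming a Pearson correlation computed separately at each CpG site, the comparison could look like this. The site labels and methylation percentages are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical methylation levels (%) for the same four patients,
# measured in blood and in tumour at two made-up CpG sites.
methylation = {
    "CpG_1": {"blood": [2.0, 3.5, 5.0, 8.0], "tumour": [10.0, 14.0, 22.0, 35.0]},
    "CpG_2": {"blood": [1.0, 4.0, 2.0, 3.0], "tumour": [20.0, 12.0, 25.0, 15.0]},
}

per_site_r = {
    site: pearson_r(levels["blood"], levels["tumour"])
    for site, levels in methylation.items()
}
for site, r in per_site_r.items():
    print(f"{site}: r = {r:.2f}")
```

A rank-based statistic such as Spearman's would be an equally reasonable choice for skewed methylation data, and testing 11 sites would normally call for a multiple-testing correction.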
Medulloblastoma Exome Sequencing Uncovers Subtype-Specific Somatic Mutations
Medulloblastomas are the most common malignant brain tumors in children [1]. Identifying and understanding the genetic events that drive these tumors is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma based on transcriptional and copy number profiles [2–5]. Here, we utilized whole exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas exhibit low mutation rates consistent with other pediatric tumors, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53. Recurrent somatic mutations were identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR, and LDB1, novel findings in medulloblastoma. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant but not wild type beta-catenin. Together, our study reveals the alteration of Wnt, Hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic beta-catenin signaling in medulloblastoma
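The "mutations per megabase" figure quoted above is a per-sample rate summarised by its median. A minimal sketch of that arithmetic follows, assuming the rate is the non-silent mutation count divided by the number of adequately covered bases in megabases. The sample names, counts and coverage values are hypothetical and do not come from the study.

```python
from statistics import median


def mutations_per_megabase(n_mutations, covered_bases):
    """Mutation rate normalised to one million covered bases."""
    return n_mutations / (covered_bases / 1_000_000)


# Hypothetical tumours: (non-silent mutation count, covered coding bases).
samples = {
    "tumor_A": (10, 30_000_000),
    "tumor_B": (12, 30_000_000),
    "tumor_C": (45, 30_000_000),
}

rates = {
    name: mutations_per_megabase(count, bases)
    for name, (count, bases) in samples.items()
}
median_rate = median(rates.values())
print(f"median non-silent mutations per Mb: {median_rate:.2f}")
```

The median is used rather than the mean so that a single heavily mutated tumour, such as the hypothetical tumor_C, does not dominate the summary.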
Therapeutic opportunities within the DNA damage response
The DNA damage response (DDR) is essential for maintaining the genomic integrity of the cell, and its disruption is one of the hallmarks of cancer. Classically, defects in the DDR have been exploited therapeutically in the treatment of cancer with radiation therapies or genotoxic chemotherapies. More recently, protein components of the DDR systems have been identified as promising avenues for targeted cancer therapeutics. Here, we present an in-depth analysis of the function, role in cancer and therapeutic potential of 450 expert-curated human DDR genes. We discuss the DDR drugs that have been approved by the US Food and Drug Administration (FDA) or that are under clinical investigation. We examine large-scale genomic and expression data for 15 cancers to identify deregulated components of the DDR, and we apply systematic computational analysis to identify DDR proteins that are amenable to modulation by small molecules, highlighting potential novel therapeutic targets
Federated Ensemble Regression Using Classification
Ensemble learning has been shown to significantly improve predictive accuracy in a variety of machine learning problems. For a given predictive task, the goal of ensemble learning is to improve predictive accuracy by combining the predictive power of multiple models. In this paper, we present an ensemble learning algorithm for regression problems which leverages the distribution of the samples in a learning set to achieve improved performance. We apply the proposed algorithm to a problem in precision medicine where the goal is to predict drug perturbation effects on genes in cancer cell lines. The proposed approach significantly outperforms the base case
COSMIC 2005
The Catalogue Of Somatic Mutations In Cancer (COSMIC) database and web site were developed to preserve somatic mutation data and share it with the community. Over the past 25 years, approximately 350 cancer genes have been identified, of which 311 are somatically mutated. COSMIC has been expanded and now holds data previously reported in the scientific literature for 28 known cancer genes. In addition, there are data from the systematic sequencing of 518 protein kinase genes. The total gene count in COSMIC stands at 538: 25 have a mutation frequency above 5% in one or more tumour types, no mutations were found in 333 genes, and 180 are rarely mutated, with frequencies <5% in any tumour set. The COSMIC web site has been expanded to give more views and summaries of the data and provide faster query routes and downloads. In addition, there is a new section describing mutations found through a screen of known cancer genes in 728 cancer cell lines including the NCI-60 set of cancer cell lines
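The three gene categories quoted above should account for the stated total of 538 genes. The short check below confirms that they do. The variable names are illustrative, and the numbers are taken directly from the abstract.

```python
# Gene counts quoted in the COSMIC 2005 abstract.
frequently_mutated = 25   # mutation frequency above 5% in at least one tumour type
no_mutations_found = 333  # genes in which no mutations were found
rarely_mutated = 180      # frequencies below 5% in every tumour set

total_genes = frequently_mutated + no_mutations_found + rarely_mutated
print(total_genes)  # 538
```

Note that 28 literature-curated genes plus 518 kinase genes would give 546, not 538. The abstract does not explain the difference, which may reflect genes that fall into both sets.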
