Master of Science thesis
Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, possibly coordinated with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA). We find that, first, the GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern copy-number variations (CNVs) that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations (e.g., in tissue batch, genomic center, hybridization date and scanner), without a-priori knowledge of these variations. Second, the pattern includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs in >3% of the patients. These include the biochemically putative drug target, cell cycle-regulated serine/threonine kinase-encoding TLK2, the cyclin E1-encoding CCNE1, and the Rb-binding histone demethylase-encoding KDM5A. Third, the pattern provides a better prognostic predictor than the chromosome numbers or any one focal CNA that it identifies, suggesting that the GBM survival phenotype is an outcome of its global genotype. The pattern is independent of age, and combined with age, makes a better predictor than age alone. GSVD comparison of matched profiles of a larger set of TCGA patients, inclusive of the initial set, confirms the global pattern. GSVD classification of the GBM profiles of an independent set of patients validates the prognostic contribution of the pattern.
Molecular phylogeny of Indo‐Pacific carpenter ants (Hymenoptera: Formicidae, Camponotus) reveals waves of dispersal and colonization from diverse source areas
Ants that resemble Camponotus maculatus (Fabricius, 1782) present an opportunity to test the hypothesis that the origin of the Pacific island fauna was primarily New Guinea, the Philippines, and the Indo‐Malay archipelago (collectively known as Malesia). We sequenced two mitochondrial and four nuclear markers from 146 specimens from Pacific islands, Australia, and Malesia. We also added 211 specimens representing a larger worldwide sample and performed a series of phylogenetic analyses and ancestral area reconstructions. Results indicate that the Pacific members of this group comprise several robust clades that have distinctly different biogeographical histories, and they suggest an important role for Australia as a source of Pacific colonizations. Malesian areas were recovered mostly in derived positions, and one lineage appears to be Neotropical. Phylogenetic hypotheses indicate that the orange, pan‐Pacific form commonly identified as C. chloroticus Emery 1897 actually consists of two distantly related lineages. Also, the lineage on Hawaiʻi, which has been called C. variegatus (Smith, 1858), appears to be closely related to C. tortuganus Emery, 1895 in Florida and other lineages in the New World. In Micronesia and Polynesia the C. chloroticus‐like species support predictions of the taxon‐cycle hypothesis and could be candidates for human‐mediated dispersal. Peer Reviewed.
GSVD Comparison of Patient-Matched Normal and Tumor aCGH Profiles Reveals Global Copy-Number Alterations Predicting Glioblastoma Multiforme Survival
Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, possibly coordinated with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA). We find that, first, the GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern copy-number variations (CNVs) that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations (e.g., in tissue batch, genomic center, hybridization date and scanner), without a-priori knowledge of these variations. Second, the pattern includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs in >3% of the patients. These include the biochemically putative drug target, cell cycle-regulated serine/threonine kinase-encoding TLK2, the cyclin E1-encoding CCNE1, and the Rb-binding histone demethylase-encoding KDM5A. Third, the pattern provides a better prognostic predictor than the chromosome numbers or any one focal CNA that it identifies, suggesting that the GBM survival phenotype is an outcome of its global genotype. The pattern is independent of age, and combined with age, makes a better predictor than age alone. GSVD comparison of matched profiles of a larger set of TCGA patients, inclusive of the initial set, confirms the global pattern. GSVD classification of the GBM profiles of an independent set of patients validates the prognostic contribution of the pattern.
Invasive Plants and Enemy Release: Evolution of Trait Means and Trait Correlations in Ulex europaeus
Several hypotheses that attempt to explain invasive processes are based on the fact that plants have been introduced without their natural enemies. Among them, the EICA (Evolution of Increased Competitive Ability) hypothesis is the most influential. It states that, due to enemy release, exotic plants evolve a shift in resource allocation from defence to reproduction or growth. In the native range of the invasive species Ulex europaeus, traits involved in reproduction and growth have been shown to be highly variable and genetically correlated. Thus, in order to explore the joint evolution of life history traits and susceptibility to seed predation in this species, we investigated changes in both trait means and trait correlations. To do so, we compared plants from native and invaded regions grown in a common garden. Consistent with the expectations of the EICA hypothesis, we observed an increase in seedling height. However, there was little change in other trait means. By contrast, correlations exhibited a clear pattern: the correlations between life history traits and infestation rate by seed predators were always weaker in the invaded range than in the native range. In U. europaeus, the role of enemy release in shaping life history traits thus appeared to involve trait correlations rather than trait means. In the invaded regions studied, the correlations involving infestation rates and key life history traits such as flowering phenology, growth and pod density were reduced, enabling more independent evolution of these key traits and potentially facilitating local adaptation to a wide range of environments. These results led us to hypothesise that a relaxation of genetic correlations may be involved in the expansion of invasive species.
Significant probelets and corresponding tumor and normal arraylets uncovered by GSVD of the patient-matched GBM and normal aCGH profiles.
<p>(<i>a</i>) Plot of the second tumor arraylet describes a global pattern of tumor-exclusive co-occurring CNAs across the tumor probes. The probes are ordered, and their copy numbers are colored, according to each probe's chromosomal location. Segments (black lines) identified by circular binary segmentation (CBS) <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Olshen1" target="_blank">[20]</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Venkatraman1" target="_blank">[21]</a> include most known GBM-associated focal CNAs (black), e.g., <i>EGFR</i> amplification. CNAs previously unrecognized in GBM (red) include an amplification of a segment containing the biochemically putative drug target-encoding <i>TLK2</i>. (<i>b</i>) Plot of the second most tumor-exclusive probelet, which is also the most significant probelet in the tumor dataset (Figure S1<i>a</i> in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098.s001" target="_blank">Appendix S1</a>), describes the corresponding variation across the patients. The patients are ordered and classified according to each patient's relative copy number in this probelet. There are 227 patients (blue) with high (&gt;0.02) and 23 patients (red) with low, approximately zero, numbers in the second probelet. One patient (gray) remains unclassified with a large negative (&lt;−0.02) number. This classification significantly correlates with GBM survival times (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g003" target="_blank">Figure 3<i>a</i></a> and Table S1 in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098.s001" target="_blank">Appendix S1</a>). 
(<i>c</i>) Raster display of the tumor dataset, with relative gain (red), no change (black) and loss (green) of DNA copy numbers, shows the correspondence between the GBM profiles and the second probelet and tumor arraylet. Chromosome 7 gain and losses of chromosomes 9p and 10, which are dominant in the second tumor arraylet (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g002" target="_blank">Figure 2<i>a</i></a>), are negligible in the patients with low copy numbers in the second probelet, but distinct in the remaining patients (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g002" target="_blank">Figure 2<i>b</i></a>). This illustrates that the copy numbers listed in the second probelet correspond to the weights of the second tumor arraylet in the GBM profiles of the patients. (<i>d</i>) Plot of the 246th normal arraylet describes an X chromosome-exclusive amplification across the normal probes. (<i>e</i>) Plot of the 246th probelet, which is approximately common to both the normal and tumor datasets, and is the second most significant in the normal dataset (Figure S1<i>b</i> in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098.s001" target="_blank">Appendix S1</a>), describes the corresponding copy-number amplification in the female (red) relative to the male (blue) patients. Classification of the patients by the 246th probelet agrees with the copy-number gender assignments (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-t001" target="_blank">Table 1</a> and Figure S9 in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098.s001" target="_blank">Appendix S1</a>), also for three patients with missing TCGA gender annotations and three additional patients with conflicting TCGA annotations and copy-number gender assignments. 
(<i>f</i>) Raster display of the normal dataset shows the correspondence between the normal profiles and the 246th probelet and normal arraylet. X chromosome amplification, which is dominant in the 246th normal arraylet (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g002" target="_blank">Figure 2<i>d</i></a>), is distinct in the female but absent in the male patients (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g002" target="_blank">Figure 2<i>e</i></a>). Note also that although the tumor samples exhibit female-specific X chromosome amplification (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g002" target="_blank">Figure 2<i>c</i></a>), the second tumor arraylet (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone-0030098-g002" target="_blank">Figure 2<i>a</i></a>) exhibits an unsegmented X chromosome copy-number distribution that is approximately centered at zero with a relatively small width.</p>
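The three-way grouping of patients described in panel (b) can be sketched in a few lines of Python. This is only an illustration: the function name `classify_by_probelet`, the symmetric cutoff of 0.02, and the toy values are assumptions made here, not the authors' code or data.

```python
def classify_by_probelet(values, cutoff=0.02):
    """Label each patient by their relative copy number in one probelet.

    Assumed rule (hypothetical, modeled on the caption's description):
      value >  cutoff  -> "high"
      value < -cutoff  -> "unclassified" (large negative)
      otherwise        -> "low" (approximately zero)
    """
    labels = []
    for v in values:
        if v > cutoff:
            labels.append("high")
        elif v < -cutoff:
            labels.append("unclassified")
        else:
            labels.append("low")
    return labels


# Toy probelet entries for five hypothetical patients.
toy_probelet = [0.07, 0.001, -0.004, 0.05, -0.03]
toy_labels = classify_by_probelet(toy_probelet)
print(toy_labels)
```

A fixed cutoff keeps the rule transparent; in practice the cutoff would be chosen from the distribution of the probelet's entries.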
Survival analyses of the three sets of patients classified by GSVD, age at diagnosis or both.
<p>(<i>a</i>) Kaplan-Meier (KM) <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Kaplan1" target="_blank">[36]</a> curves for the 247 patients with TCGA annotations in the initial set of 251 patients, classified by copy numbers in the second probelet, which is computed by GSVD for the 251 patients, show a median survival time difference of 16 months, with the corresponding log-rank test <i>P</i>-value . The univariate Cox <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Cox1" target="_blank">[37]</a> proportional hazard ratio is 2.3, with a <i>P</i>-value (Table S1), meaning that high relative copy numbers in the second probelet confer more than twice the hazard of low numbers. The <i>P</i>-values are calculated without adjusting for multiple comparisons <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Rothman1" target="_blank">[38]</a>. (<i>b</i>) Survival analyses of the 247 patients classified by age, i.e., younger or older than 50 years at diagnosis, show that the prognostic contribution of age, with a KM median survival time difference of 11 months and a univariate Cox hazard ratio of 2, is comparable to that of GSVD. (<i>c</i>) Survival analyses of the 247 patients classified by both GSVD and age show similar multivariate Cox hazard ratios, of 1.8 and 1.7, that do not differ significantly from the corresponding univariate hazard ratios, of 2.3 and 2, respectively. This means that GSVD and age are independent prognostic predictors. With a KM median survival time difference of 22 months, GSVD and age combined make a better predictor than age alone. 
(<i>d</i>) Survival analyses of the 334 patients with TCGA annotations and a GSVD classification in the inclusive confirmation set of 344 patients, classified by copy numbers in the second probelet, which is computed by GSVD for the 344 patients, show a KM median survival time difference of 16 months and a univariate hazard ratio of 2.4, and confirm the survival analyses of the initial set of 251 patients. (<i>e</i>) Survival analyses of the 334 patients classified by age confirm that the prognostic contribution of age, with a KM median survival time difference of 10 months and a univariate hazard ratio of 2, is comparable to that of GSVD. (<i>f</i>) Survival analyses of the 334 patients classified by both GSVD and age show similar multivariate Cox hazard ratios, of 1.9 and 1.8, that do not differ significantly from the corresponding univariate hazard ratios, and a KM median survival time difference of 22 months, with the corresponding log-rank test <i>P</i>-value . This confirms that the prognostic contribution of GSVD is independent of age, and that combined with age, GSVD makes a better predictor than age alone. (<i>g</i>) Survival analyses of the 183 patients with a GSVD classification in the independent validation set of 184 patients, classified by correlations of each patient's GBM profile with the second tumor arraylet, which is computed by GSVD for the 251 patients, show a KM median survival time difference of 12 months and a univariate hazard ratio of 2.9, and validate the survival analyses of the initial set of 251 patients. (<i>h</i>) Survival analyses of the 183 patients classified by age validate that the prognostic contribution of age is comparable to that of GSVD. (<i>i</i>) Survival analyses of the 183 patients classified by both GSVD and age show similar multivariate Cox hazard ratios, of 2 and 2.2, and a KM median survival time difference of 41 months, with the corresponding log-rank test <i>P</i>-value . 
This validates that the prognostic contribution of GSVD is independent of age, and that combined with age, GSVD makes a better predictor than age alone, also for patients with measured GBM aCGH profiles in the absence of matched normal profiles.</p>
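The "KM median survival time difference" quoted throughout this caption can be illustrated with a minimal Kaplan-Meier estimator. The sketch below uses only the Python standard library; the function names and the toy survival times are hypothetical and are not taken from the TCGA data.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each time where at least one death occurs.

    times  : follow-up time for each patient (e.g., months)
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy groups (hypothetical months of follow-up; 0 marks a censored patient).
group_a = kaplan_meier([2, 4, 6, 8, 10], [1, 1, 1, 1, 1])
group_b = kaplan_meier([10, 20, 30, 40, 50], [1, 1, 1, 0, 1])
median_difference = median_survival(group_b) - median_survival(group_a)
print(median_difference)
```

Comparing two such curves with a log-rank test, or fitting a Cox model for the hazard ratios, would normally be done with a dedicated survival-analysis package rather than by hand.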
Generalized singular value decomposition (GSVD) of the TCGA patient-matched tumor and normal aCGH profiles.
<p>The structure of the patient-matched but probe-independent tumor and normal datasets of the initial set of 251 patients, i.e., 251 arrays by 212,696 tumor probes and 251 arrays by 211,227 normal probes, is of an order higher than that of a single matrix. The patients, the tumor and normal probes as well as the tissue types, each represent a degree of freedom. Unfolded into a single matrix, some of the degrees of freedom are lost and much of the information in the datasets might also be lost. The GSVD simultaneously separates the paired datasets into paired weighted sums of outer products of two patterns each: one pattern of copy-number variation across the patients, i.e., a “probelet”, which is identical for both the tumor and normal datasets, combined with either the corresponding tumor-specific pattern of copy-number variation across the tumor probes, i.e., the “tumor arraylet”, or the corresponding normal-specific pattern across the normal probes, i.e., the “normal arraylet” (Equation 1). This is depicted in a raster display, with relative copy-number gain (red), no change (black) and loss (green), explicitly showing only the first through the 10th and the 242nd through the 251st probelets and corresponding tumor and normal arraylets, which capture 52% and 71% of the information in the tumor and normal dataset, respectively. The significance of the probelet in the tumor dataset relative to its significance in the normal dataset is defined in terms of an “angular distance” that is proportional to the ratio of these weights (Equation 4). This is depicted in a bar chart display, showing that the first and second probelets are almost exclusive to the tumor dataset with angular distances 2/9, the 247th to 251st probelets are approximately exclusive to the normal dataset with angular distances , and the 246th probelet is relatively common to the normal and tumor datasets with an angular distance . 
We find and confirm that the second most tumor-exclusive probelet, which is also the most significant probelet in the tumor dataset, significantly correlates with GBM prognosis. The corresponding tumor arraylet describes a global pattern of tumor-exclusive co-occurring CNAs, including most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs, including the biochemically putative drug target-encoding <i>TLK2</i> <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Heidenblad1" target="_blank">[22]</a>–<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Sillj1" target="_blank">[25]</a>. We find and validate that a negligible weight of the global pattern in a patient's GBM aCGH profile is indicative of a significantly longer GBM survival time. It was shown that the GSVD provides a mathematical framework for comparative modeling of DNA microarray data from two organisms <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Alter1" target="_blank">[12]</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Alter2" target="_blank">[39]</a>. Recent experimental results <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Omberg1" target="_blank">[40]</a> verify a computationally predicted genome-wide mode of regulation <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Alter3" target="_blank">[41]</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0030098#pone.0030098-Omberg2" target="_blank">[42]</a>, and demonstrate that GSVD modeling of DNA microarray data can be used to correctly predict previously unknown cellular mechanisms. 
This GSVD comparative modeling of aCGH data from patient-matched tumor and normal samples, therefore, draws a mathematical analogy between the prediction of cellular modes of regulation and the prognosis of cancers.</p>
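The decomposition described in this caption can be sketched numerically. The code below is a minimal illustration, assuming NumPy is available, that both matrices have at least as many rows (probes) as columns (patients), and that the stacked matrix has full column rank. It computes a GSVD through a QR factorization followed by an SVD, which is one standard construction and not necessarily the routine the authors used. The angular distance is taken here as arctan(σ1/σ2) − π/4, which is an assumption about the form of the Equation 4 cited above, and the random matrices are toy data.

```python
import numpy as np


def gsvd(d1, d2):
    """GSVD sketch: d1 = u1 @ diag(c) @ vt and d2 = u2 @ diag(s) @ vt.

    d1, d2 : arrays with the same number of columns (patients) and
             at least that many rows (probes) each.
    Rows of vt play the role of "probelets"; columns of u1 and u2
    play the role of tumor and normal "arraylets".
    """
    m1 = d1.shape[0]
    # Thin QR of the stacked datasets: [d1; d2] = [q1; q2] @ r.
    q, r = np.linalg.qr(np.vstack([d1, d2]))
    q1, q2 = q[:m1], q[m1:]
    # SVD of the top block gives u1, the cosines c, and the shared basis w.
    u1, c, wt = np.linalg.svd(q1, full_matrices=False)
    # Columns of q2 @ w are orthogonal with norms s, where c**2 + s**2 = 1.
    q2w = q2 @ wt.T
    s = np.linalg.norm(q2w, axis=0)
    u2 = q2w / s  # assumes no s is exactly zero
    vt = wt @ r   # shared right basis (rows are not normalized here)
    return u1, c, u2, s, vt


def angular_distance(c, s):
    """Assumed definition: arctan(c / s) - pi/4, in [-pi/4, pi/4]."""
    return np.arctan2(c, s) - np.pi / 4


# Toy data: 40 "tumor probes" and 30 "normal probes" for 6 "patients".
rng = np.random.default_rng(0)
d1 = rng.normal(size=(40, 6))
d2 = rng.normal(size=(30, 6))
u1, c, u2, s, vt = gsvd(d1, d2)
theta = angular_distance(c, s)
```

Because `np.linalg.svd` returns `c` in decreasing order, the first rows of `vt` have the largest angular distances (most exclusive to the first dataset) and the last rows the smallest, which mirrors the ordering of probelets described in the caption.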
Enrichment of the significant probelets in TCGA annotations.
