155 research outputs found

    Machine Learning for Cancer Survival Prediction

    Cancer is a leading cause of death worldwide and the second leading cause of death in Germany. The primary goal of cancer therapy is to reduce mortality and improve patient survival. However, the choice of therapy is heavily influenced by the patient’s prognosis, highlighting the importance of cancer survival prediction as a means to quantify the patient’s risk and estimate prognosis. This dissertation presents a cancer survival prediction approach that uses XGBoost tree ensemble learning and is based on gene expression data of 25 different cancer types from The Cancer Genome Atlas (TCGA). We evaluate two versions of this approach, one trained on each cancer type separately and the other trained on pan-cancer data comprising all 25 cancer types, and find that the pan-cancer approach yields improved performance over the single-cancer approach. Furthermore, we evaluate the pan-cancer approach on additional molecular data types, including mutations, copy number variations, and protein expression data, and identify gene expression as the most informative data type. To assess the biological plausibility of the gene expression-based pan-cancer survival prediction approach, we apply network propagation to gene weights derived from the survival prediction model and infer a pan-cancer survival network comprising 103 genes. These 103 genes are most significantly enriched for the tumor microenvironment, which has been associated with cancer progression, metastasis, and response to therapy, validating the biological plausibility of our survival prediction approach. Furthermore, we explore the potential of transfer learning for cancer survival prediction. To this end, we pre-train neural networks for cancer survival prediction, but also for related tasks such as tissue type and age prediction. We then transfer the learned knowledge to cancer survival prediction on independent datasets from TCGA, as well as substantially smaller cancer studies. We find that transfer learning can indeed improve cancer survival prediction, although the benefit of transfer learning may depend on the size and characteristics of the datasets used.

    A gradient tree boosting and network propagation derived pan-cancer survival network of the tumor microenvironment

    Predicting cancer survival from molecular data is an important aspect of biomedical research because it allows quantifying patient risks and thus individualizing therapy. We introduce XGBoost tree ensemble learning to predict survival from transcriptome data of 8,024 patients from 25 different cancer types and show performance that is highly competitive with state-of-the-art methods. To further improve the plausibility of the machine learning approach, we conducted two additional steps. In the first step, we applied pan-cancer training and showed that it substantially improves prognosis compared with cancer subtype-specific training. In the second step, we applied network propagation and inferred a pan-cancer survival network consisting of 103 genes. This network highlights cross-cohort features and is predictive for the tumor microenvironment and immune status of the patients. Our work demonstrates that pan-cancer learning combined with network propagation generalizes over multiple cancer types and identifies biologically plausible features that can serve as biomarkers for monitoring cancer survival.
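
The network propagation step can be illustrated with a small sketch. The abstract does not name the propagation algorithm, so the random walk with restart used here is an assumption, and the four-gene path network, the gene names `G1` to `G4`, the seed weight, and the `restart` parameter are all hypothetical. The idea is that a weight learned for one gene spreads to its interaction partners, so genes close to highly weighted genes also receive a score.

```python
def propagate(adjacency, seeds, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph (assumed algorithm).

    adjacency: dict mapping each node to the set of its neighbours
    seeds:     dict mapping nodes to non-negative initial weights
    Returns a dict of propagated scores that sum to 1 on a connected graph.
    """
    nodes = sorted(adjacency)
    total = sum(seeds.values())
    # Normalised seed distribution; nodes without a seed start at zero.
    p0 = {n: seeds.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m splits its current score equally among its own neighbours.
            inflow = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy interaction network: a path G1 - G2 - G3 - G4.
toy_network = {
    "G1": {"G2"},
    "G2": {"G1", "G3"},
    "G3": {"G2", "G4"},
    "G4": {"G3"},
}

# Hypothetical model-derived weight: only G1 carries weight before propagation.
scores = propagate(toy_network, {"G1": 1.0})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

In this toy setting the score decays with distance from the seeded gene. On a real protein interaction network, a subnetwork would then be selected from the highest-scoring genes; how the 103-gene network was selected is not described in the abstract.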

    Gradient tree boosting and network propagation for the identification of pan-cancer survival networks

    Cancer survival prediction is typically done with uninterpretable machine learning techniques, e.g., gradient tree boosting. Therefore, additional steps are needed to infer biological plausibility of the predictions. Here, we describe a protocol that combines pan-cancer survival prediction with XGBoost tree-ensemble learning and subsequent propagation of the learned feature weights on protein interaction networks. This protocol is based on TCGA transcriptome data of 8,024 patients from 25 cancer types but can easily be adapted to cancer patient data from other sources. For complete details on the use and execution of this protocol, please refer to Thedinga and Herwig (2022).

    Ovarian Cancer: Recommendations for research funding in detection, treatment and prevention

    This year over 230,000 women worldwide will be diagnosed with ovarian cancer, and 150,000 will die from this disease, making ovarian cancer the most lethal gynecologic cancer. While many cancers like lung, breast and prostate cancer have enjoyed amazing progress in the past 50 years, ovarian cancer mortality has remained high due to late detection, lack of novel treatment approaches and lack of prevention options. With the advent of genetic testing, we have an opportunity to reduce mortality by better understanding ovarian cancer contributing factors and developing prevention or early detection measures. Effective screening programs are desperately needed to identify cancer cases before late-stage advancement. Lastly, many improved treatment options are currently in clinical trials, including approaches to engage the immune system to better reduce cancerous cell growth. Improving ovarian cancer survival rates will require a shift in how research funds are allocated, moving government funds from treatment toward research on effective prevention and screening measures. Master of Public Health

    Forecasting resource requirements for drug development long range planning

    Thesis (M.B.A.)--Massachusetts Institute of Technology, Sloan School of Management; and, (S.M.)--Massachusetts Institute of Technology, Dept. of Chemical Engineering; in conjunction with the Leaders for Manufacturing Program at MIT, 2010. "June 2010." Cataloged from PDF version of thesis. Includes bibliographical references (p. 64). This thesis investigates the use of a task-based Monte Carlo simulation model to forecast headcount and manufacturing capacity requirements for a drug development organization. A pharmaceutical drug development group is responsible for designing the manufacturing process for new potential drug products, testing the product quality, and supplying product for clinical trials. The drug development process is complex and uncertain. The speed to market is critical to a company's success. Therefore, it is important to have an adequate number of employees and available manufacturing capacity to support timely and efficient drug development. The employees and manufacturing capacity can be supplied either internally or externally, through contract manufacturing organizations. This thesis formulates and empirically evaluates a simulation model that is designed using the Novartis Biologics drug development process and is adaptable to other pharmaceutical organizations. The model demonstrates 7% accuracy when compared with historical data, and estimates within 13% of the currently accepted manufacturing capacity forecasting tool. Additionally, three case studies are included to demonstrate how the model can be used to evaluate strategic decisions. The case studies include: a drug development process improvement evaluation, an outsourcing evaluation, and an "at risk" development evaluation. By Angela Thedinga. S.M. M.B.A.
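
A task-based Monte Carlo headcount forecast of the kind described above might be sketched as follows. This is not the thesis model: the task list, the effort ranges, the triangular distribution, the per-task continuation probability, and the 12-month planning horizon are all hypothetical choices made for illustration. Each run draws an uncertain effort for every task, allows the project to stop after any task, and converts the total effort into an average headcount.

```python
import random

# Hypothetical tasks: (name, minimum, most likely, maximum) effort in person-months.
TASKS = [
    ("process design", 4.0, 6.0, 10.0),
    ("quality testing", 2.0, 3.0, 6.0),
    ("clinical supply", 3.0, 5.0, 9.0),
]


def simulate_headcount(tasks, success_prob, months, runs, seed=0):
    """Return a sorted list of simulated average headcounts (full-time equivalents)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    results = []
    for _ in range(runs):
        effort = 0.0
        for _name, low, mode, high in tasks:
            # Uncertain task effort drawn from a triangular distribution (assumption).
            effort += rng.triangular(low, high, mode)
            # Assumed attrition: the project may be stopped after any task.
            if rng.random() > success_prob:
                break
        # Person-months spread over the planning horizon gives average headcount.
        results.append(effort / months)
    results.sort()
    return results


headcounts = simulate_headcount(TASKS, success_prob=0.8, months=12.0, runs=5000)
mean_fte = sum(headcounts) / len(headcounts)
p90_fte = headcounts[int(0.9 * len(headcounts))]
print(f"mean headcount: {mean_fte:.2f} FTE, 90th percentile: {p90_fte:.2f} FTE")
```

Reporting a high percentile alongside the mean is one way to plan capacity for unfavorable scenarios rather than only for the expected case.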

    Cancer drug sensitivity estimation using modular deep Graph Neural Networks

    Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are tailored to the transcriptomic profile of a given primary tumor. The SMILES representation of molecules that is used by state-of-the-art drug-sensitivity models is not conducive for neural networks to generalize to new drugs, in part because the distance between atoms does not generally correspond to the distance between their representation in the SMILES strings. Graph-attention networks, on the other hand, are high-capacity models that require large training-data volumes which are not available for drug-sensitivity estimation. We develop a modular drug-sensitivity graph-attentional neural network. The modular architecture allows us to separately pre-train the graph encoder and graph-attentional pooling layer on related tasks for which more data are available. We observe that this model outperforms reference models for the use cases of precision oncology and drug discovery; in particular, it is better able to predict the specific interaction between drug and cell line that is not explained by the general cytotoxicity of the drug and the overall survivability of the cell line. The complete source code is available at https://zenodo.org/doi/10.5281/zenodo.8020945. All experiments are based on the publicly available GDSC data.

    Matching anticancer compounds and tumor cell lines by neural networks with ranking loss

    Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are likely to achieve the highest efficacy for a cancer cell line at hand at a therapeutic dose. State-of-the-art drug sensitivity models use regression techniques to predict the inhibitory concentration of a drug for a tumor cell line. This regression objective is not directly aligned with either of these principal goals of drug sensitivity models: We argue that drug sensitivity modeling should be seen as a ranking problem with an optimization criterion that quantifies a drug’s inhibitory capacity for the cancer cell line at hand relative to its toxicity for healthy cells. We derive an extension to the well-established drug sensitivity regression model PaccMann that employs a ranking loss and focuses on the ratio of inhibitory concentration and therapeutic dosage range. We find that the ranking extension significantly enhances the model’s capability to identify the most effective anticancer drugs for unseen tumor cell profiles based on in-vitro data.
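
The difference between a regression objective and a ranking objective can be made concrete with a small sketch. The abstract does not specify the loss, so the pairwise logistic ranking loss below is an assumption rather than the authors' formulation, and the scores are made-up numbers where a higher value means a more suitable drug for one cell line. The loss only looks at the order of predictions within drug pairs, not at their absolute values.

```python
import math


def pairwise_ranking_loss(predicted, target):
    """Mean logistic loss over all ordered drug pairs (i, j) with target[i] > target[j].

    The loss is small when the predicted score of the better drug i
    exceeds the predicted score of the worse drug j by a large margin.
    """
    loss, pairs = 0.0, 0
    for i in range(len(target)):
        for j in range(len(target)):
            if target[i] > target[j]:
                margin = predicted[i] - predicted[j]
                loss += math.log1p(math.exp(-margin))
                pairs += 1
    return loss / pairs if pairs else 0.0


# Hypothetical suitability scores of four drugs for one cell line (higher is better).
target = [3.0, 2.0, 1.0, 0.0]

well_ordered = [2.5, 1.0, 0.2, -1.0]   # same order as the target
reversed_order = [-1.0, 0.2, 1.0, 2.5]  # opposite order
shifted = [p + 10.0 for p in well_ordered]  # same order, very different absolute values

good_loss = pairwise_ranking_loss(well_ordered, target)
bad_loss = pairwise_ranking_loss(reversed_order, target)
shifted_loss = pairwise_ranking_loss(shifted, target)
print(good_loss, bad_loss, shifted_loss)
```

Adding a constant to every prediction leaves this loss unchanged, whereas a squared-error regression loss would penalize it heavily; this is the sense in which a ranking criterion targets the choice among drugs rather than the exact predicted value.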

    Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction

    Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves both the models’ accuracy and the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases.

    Weight Stigma Experiences and Physical (In)activity: A Biographical Analysis

    Introduction: People with obesity often report experiences of weight-related discrimination. In order to find out how such experiences throughout the life course are related to physical activity behavior, we exploratively studied activity-related biographies of people with obesity from a social constructivist perspective. Methods: We collected biographical data of 30 adults (mean age 37.66 years; 14 males and 16 females) with obesity (average BMI 40.64, ranging from 33 to 58) using a biography visualization tool that allows participants to map developmental courses and critical life experiences over their life course. Results: Participants remembered a continuous decrease of physical activity from childhood to mid-adulthood. Weight-related discrimination, both in sport and non-sport settings, was especially experienced in adolescence and mid-adulthood. Against the background of our findings, we assume that the degree of felt stigma rather than the stigmatizing behavior itself influences physical activity behavior over the life course. Conclusion: The results of our exploratory study reiterate the detrimental effect weight stigma can have on health behaviors. Initiatives are needed to reduce weight stigma in exercise contexts; additionally, initiatives to promote physical activity should focus on helping individuals with obesity to establish coping strategies to reduce the experienced burden from weight stigma.