253 research outputs found

    A Variant in a MicroRNA complementary site in the 3' UTR of the KIT oncogene increases risk of acral melanoma.

    MicroRNAs (miRNAs) are small ∼22-nt single-stranded RNAs that negatively regulate protein expression by binding to partially complementary sequences in the 3' untranslated regions (3' UTRs) of target gene messenger RNAs (mRNAs). Recently, mutations have been identified in both miRNAs and target genes that disrupt regulatory relationships, contribute to oncogenesis and serve as biomarkers for cancer risk. KIT, an established oncogene with a multifaceted role in melanogenesis and melanoma pathogenesis, has recently been shown to be upregulated in some melanomas, and is also a target of the miRNA miR-221. Here, we describe a genetic variant in the 3' UTR of the KIT oncogene that correlates with a greater than fourfold increased risk of acral melanoma. This KIT variant results in a mismatch in the seed region of a miR-221 complementary site, and reporter data suggest that this mismatch can result in increased expression of the KIT oncogene. Consistent with the hypothesis that this is a functional variant, KIT mRNA and protein levels are both increased in the majority of samples harboring the KIT variant. This work identifies a novel genetic marker for increased heritable risk of melanoma.
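
The seed-mismatch idea in this abstract can be illustrated with a minimal sketch. It assumes the common convention that the seed is miRNA positions 2 to 8 counted from the 5' end, and it tests only for a perfect Watson-Crick match. The sequences below are made-up toy strings, not the real miR-221 or KIT 3' UTR sequences, and the function names are hypothetical.

```python
# Toy sketch: does a single-nucleotide variant break a miRNA seed match?
# All sequences are invented for illustration; they are NOT miR-221 or KIT.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna: str) -> str:
    """Target-site sequence (5'->3') that pairs perfectly with miRNA positions 2-8."""
    seed = mirna[1:8]  # positions 2..8 in 1-based numbering
    return reverse_complement(seed)


def has_seed_match(utr: str, mirna: str) -> bool:
    """True if the UTR contains a perfect Watson-Crick match to the seed."""
    return seed_site(mirna) in utr


toy_mirna = "UGCAUCGGAUCCAGUUAGCAGU"  # hypothetical 22-nt miRNA
reference_utr = "AAGG" + seed_site(toy_mirna) + "CCUA"  # complementary site intact

# Hypothetical variant: change one base inside the complementary site.
pos = 6
variant_base = "A" if reference_utr[pos] != "A" else "C"
variant_utr = reference_utr[:pos] + variant_base + reference_utr[pos + 1:]

print(has_seed_match(reference_utr, toy_mirna))
print(has_seed_match(variant_utr, toy_mirna))
```

Real target prediction also weighs partial pairing and site context, so a lost perfect match in this sketch only hints at weakened regulation; it does not establish it.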

    Criteria for the use of omics-based predictors in clinical trials.

    The US National Cancer Institute (NCI), in collaboration with scientists representing multiple areas of expertise relevant to 'omics'-based test development, has developed a checklist of criteria that can be used to determine the readiness of omics-based tests for guiding patient care in clinical trials. The checklist criteria cover issues relating to specimens, assays, mathematical modelling, clinical trial design, and ethical, legal and regulatory aspects. Funding bodies and journals are encouraged to consider the checklist, which they may find useful for assessing study quality and evidence strength. The checklist will be used to evaluate proposals for NCI-sponsored clinical trials in which omics tests will be used to guide therapy

    Batch effect confounding leads to strong bias in performance estimates obtained by cross-validation.

    BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, are performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.
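
The bias described in this abstract can be reproduced in a deliberately simplified sketch. It is not the study's pipeline. It assumes a single feature that carries a batch shift but no group effect, a nearest-centroid classifier, and plain 5-fold cross-validation instead of the nested scheme with feature selection. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate(n_per_cell, confounded, batch_shift=3.0, rng=None):
    """Return (x, group) pairs; x has a batch effect but no true group effect."""
    rng = rng or random.Random(0)
    data = []
    for batch in (0, 1):
        for group in (0, 1):
            if confounded and group != batch:
                continue  # full confounding: each group appears in one batch only
            for _ in range(n_per_cell):
                data.append((rng.gauss(batch_shift * batch, 1.0), group))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Mean feature value per group."""
    return {g: statistics.fmean([x for x, y in train if y == g]) for g in (0, 1)}


def accuracy(centroids, test):
    """Fraction of samples whose nearest centroid matches their group."""
    hits = sum(
        min(centroids, key=lambda g: abs(x - centroids[g])) == y for x, y in test
    )
    return hits / len(test)


def cross_validate(data, k=5):
    """Plain k-fold cross-validated accuracy."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit_centroids(train), folds[i]))
    return statistics.fmean(scores)


rng = random.Random(42)
confounded_data = simulate(100, confounded=True, rng=rng)
independent_data = simulate(100, confounded=False, rng=rng)

cv_estimate = cross_validate(confounded_data)
external = accuracy(fit_centroids(confounded_data), independent_data)
print(f"cross-validated accuracy on confounded data: {cv_estimate:.2f}")
print(f"accuracy on independent, unconfounded data: {external:.2f}")
```

Because the classifier has only learned the batch shift, the cross-validated estimate looks strong while accuracy on data where group and batch are unrelated stays near chance.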

    Lectin-like bacteriocins from Pseudomonas spp. utilise D-rhamnose-containing lipopolysaccharide as a cellular receptor

    Lectin-like bacteriocins consist of tandem monocot mannose-binding domains and display a genus-specific killing activity. Here we show that pyocin L1, a novel member of this family from Pseudomonas aeruginosa, targets susceptible strains of this species through recognition of the common polysaccharide antigen (CPA) of P. aeruginosa lipopolysaccharide that is predominantly a homopolymer of d-rhamnose. Structural and biophysical analyses show that recognition of CPA occurs through the C-terminal carbohydrate-binding domain of pyocin L1 and that this interaction is a prerequisite for bactericidal activity. Further to this, we show that the previously described lectin-like bacteriocin putidacin L1 shows a similar carbohydrate-binding specificity, indicating that oligosaccharides containing d-rhamnose and not d-mannose, as was previously thought, are the physiologically relevant ligands for this group of bacteriocins. The widespread inclusion of d-rhamnose in the lipopolysaccharide of members of the genus Pseudomonas explains the unusual genus-specific activity of the lectin-like bacteriocins

    Predictive value of S100-B and copeptin for outcomes following seizure: the BISTRO International Cohort Study.

    OBJECTIVE: To evaluate the performance of S100-B protein and copeptin, in addition to clinical variables, in predicting outcomes of patients attending the emergency department (ED) following a seizure. METHODS: We prospectively included adult patients presenting with an acute seizure in four EDs in France and the United Kingdom. Participants were followed up for 28 days. The primary endpoint was a composite of seizure recurrence, all-cause mortality, hospitalization or rehospitalisation, or return visit to the ED within seven days. RESULTS: Among the 389 participants included in the analysis, 156 (40%) experienced the primary endpoint within seven days and 195 (54%) at 28 days. Mean levels of both S100-B (0.11 μg/l [95% CI 0.07-0.20] vs 0.09 μg/l [0.07-0.14]) and copeptin (23 pmol/l [9-104] vs 17 pmol/l [8-43]) were higher in participants meeting the primary endpoint. However, both biomarkers were poorly predictive of the primary outcome, with respective areas under the receiver operating characteristic curve of 0.57 [0.51-0.64] and 0.59 [0.54-0.64]. Multivariable logistic regression analysis identified higher age (odds ratio [OR] 1.3 per decade [1.1-1.5]), provoked seizure (OR 4.93 [2.5-9.8]), complex partial seizure (OR 4.09 [1.8-9.1]) and first seizure (OR 1.83 [1.1-3.0]) as independent predictors of the primary outcome. A second regression analysis including the biomarkers showed no additional predictive benefit (S100-B OR 3.89 [0.80-18.9]; copeptin OR 1 [1.00-1.00]). CONCLUSION: The plasma biomarkers S100-B and copeptin did not improve prediction of poor outcome following seizure. Higher age, a first seizure, a provoked seizure and a complex partial seizure are independently associated with adverse outcomes.
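
To make the reported areas under the ROC curve easier to interpret, the sketch below computes an AUC as the probability that a randomly chosen patient with the outcome has a higher marker value than a randomly chosen patient without it, counting ties as one half. The marker values are invented toy numbers, not data from the study, and the function name is hypothetical.

```python
from typing import Sequence


def auc(with_outcome: Sequence[float], without_outcome: Sequence[float]) -> float:
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic)."""
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in with_outcome for n in without_outcome
    )
    return wins / (len(with_outcome) * len(without_outcome))


# Hypothetical marker levels, for illustration only.
toy_with = [0.11, 0.20, 0.07, 0.15]
toy_without = [0.09, 0.07, 0.14, 0.10]

print(auc(toy_with, toy_without))   # moderate separation
print(auc([1.0, 2.0], [1.0, 2.0]))  # identical distributions give 0.5
```

On this scale 0.5 means the marker ranks patients no better than chance, which is why values of 0.57 and 0.59 are described as poorly predictive.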

    Optimally splitting cases for training and testing high dimensional classifiers

    BACKGROUND: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? RESULTS: We develop a non-parametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. CONCLUSIONS: By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy, with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating 2/3 of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.

    Surgical management and outcome of newly diagnosed glioblastoma without contrast enhancement (low-grade appearance): a report of the RANO resect group

    BACKGROUND: Resection of the contrast-enhancing (CE) tumor represents the standard of care in newly diagnosed glioblastoma. However, some tumors ultimately diagnosed as glioblastoma lack contrast enhancement and have a 'low-grade appearance' on imaging (non-CE glioblastoma). We aimed to (a) volumetrically define the value of non-CE tumor resection in the absence of contrast enhancement, and to (b) delineate outcome differences between glioblastoma patients with and without contrast enhancement. METHODS: The RANO resect group retrospectively compiled a global, eight-center cohort of patients with newly diagnosed glioblastoma per WHO 2021 classification. The associations between postoperative tumor volumes and outcome were analyzed. Propensity score-matched analyses were constructed to compare glioblastomas with and without contrast enhancement. RESULTS: Among 1323 newly diagnosed IDH-wildtype glioblastomas, we identified 98 patients (7.4%) without contrast enhancement. In such patients, smaller postoperative tumor volumes were associated with more favorable outcome. There was an exponential increase in risk for death with larger residual non-CE tumor. Accordingly, extensive resection was associated with improved survival compared to lesion biopsy. These findings were retained on a multivariable analysis adjusting for demographic and clinical markers. Compared to CE glioblastoma, patients with non-CE glioblastoma had a more favorable clinical profile and superior outcome as confirmed in propensity score analyses by matching the patients with non-CE glioblastoma to patients with CE glioblastoma using a large set of clinical variables. CONCLUSIONS: The absence of contrast enhancement characterizes a less aggressive clinical phenotype of IDH-wildtype glioblastomas. Maximal resection of non-CE tumors has prognostic implications and translates into favorable outcome.

    Proteomic Basis of the Antibody Response to Monkeypox Virus Infection Examined in Cynomolgus Macaques and a Comparison to Human Smallpox Vaccination

    Monkeypox is a zoonotic viral disease that occurs primarily in Central and West Africa. A recent outbreak in the United States heightened public health concerns for susceptible human populations. Vaccinating with vaccinia virus to prevent smallpox is also effective for monkeypox due to a high degree of sequence conservation. Yet, the identity of antigens within the monkeypox virus proteome contributing to immune responses has not been described in detail. We compared antibody responses to monkeypox virus infection and human smallpox vaccination by using a protein microarray covering 92–95% (166–192 proteins) of representative proteomes from monkeypox viral clades of Central and West Africa, including 92% coverage (250 proteins) of the vaccinia virus proteome as a reference orthopox vaccine. All viral gene clones were verified by sequencing and purified recombinant proteins were used to construct the microarray. Serum IgG of cynomolgus macaques that recovered from monkeypox recognized at least 23 separate proteins within the orthopox proteome, while only 14 of these proteins were recognized by IgG from vaccinated humans. There were 12 of 14 antigens detected by sera of human vaccinees that were also recognized by IgG from convalescent macaques. The greatest level of IgG binding for macaques occurred with the structural proteins F13L and A33R, and the membrane scaffold protein D13L. Significant IgM responses directed towards A44R, F13L and A33R of monkeypox virus were detected before onset of clinical symptoms in macaques. Thus, antibodies from vaccination recognized a small number of proteins shared with pathogenic virus strains, while recovery from infection also involved humoral responses to antigens uniquely recognized within the monkeypox virus proteome

    A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

    BACKGROUND: Bioactivity profiling using high-throughput in vitro assays can reduce the cost and time required for toxicological screening of environmental chemicals and can also reduce the need for animal testing. Several public efforts are aimed at discovering patterns or classifiers in high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. Supervised machine learning is a powerful approach to discover combinatorial relationships in complex in vitro/in vivo datasets. We present a novel model to simulate complex chemical-toxicology data sets and use this model to evaluate the relative performance of different machine learning (ML) methods. RESULTS: The classification performance of Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Linear Discriminant Analysis (LDA), Naïve Bayes (NB), Recursive Partitioning and Regression Trees (RPART), and Support Vector Machines (SVM) in the presence and absence of filter-based feature selection was analyzed using K-way cross-validation testing and independent validation on simulated in vitro assay data sets with varying levels of model complexity, number of irrelevant features and measurement noise. While the prediction accuracy of all ML methods decreased as non-causal (irrelevant) features were added, some ML methods performed better than others. In the limit of using a large number of features, ANN and SVM were always in the top performing set of methods while RPART and KNN (k = 5) were always in the poorest performing set. The addition of measurement noise and irrelevant features decreased the classification accuracy of all ML methods, with LDA suffering the greatest performance degradation. LDA performance is especially sensitive to the use of feature selection. Filter-based feature selection generally improved performance, most strikingly for LDA. CONCLUSION: We have developed a novel simulation model to evaluate machine learning methods for the analysis of data sets in which in vitro bioassay data is being used to predict in vivo chemical toxicology. From our analysis, we can recommend that several ML methods, most notably SVM and ANN, are good candidates for use in real world applications in this area.

    Complement C1 Esterase Inhibitor Levels Linked to Infections and Contaminated Heparin-Associated Adverse Events

    Activation of kinin-kallikrein and complement pathways by oversulfated-chondroitin-sulfate (OSCS) has been linked with recent heparin-associated adverse clinical events. Given the fact that the majority of patients who received contaminated heparin did not experience an adverse event, it is of particular importance to determine the circumstances that increase the risk of a clinical reaction. In this study, we demonstrated by both the addition and affinity depletion of C1inh from normal human plasma, that the level of C1inh in the plasma has a great impact on the OSCS-induced kallikrein activity and its kinetics. OSCS-induced kallikrein activity was dramatically increased after C1inh was depleted, while the addition of C1inh completely attenuated kallikrein activity. In addition, actual clinical infection can lead to increased C1inh levels. Plasma from patients with sepsis had higher average levels of functional C1inh and decreased OSCS-induced kallikrein activity. Lastly, descriptive data on adverse event reports suggest cases likely to be associated with contaminated heparin are inversely correlated with infection. Our data suggest that low C1inh levels can be a risk factor and high levels can be protective. The identification of risk factors for contact system-mediated adverse events may allow for patient screening and clinical development of prophylaxis and treatments