1,532 research outputs found

    Quantifying Model Complexity via Functional Decomposition for Better Post-Hoc Interpretability

    Post-hoc model-agnostic interpretation methods such as partial dependence plots can be employed to interpret complex machine learning models. While these interpretation methods can be applied regardless of model complexity, they can produce misleading and verbose results if the model is too complex, especially w.r.t. feature interactions. To quantify the complexity of arbitrary machine learning models, we propose model-agnostic complexity measures based on functional decomposition: number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures is more reliable and compact. Furthermore, we demonstrate the application of these measures in a multi-objective optimization approach which simultaneously minimizes loss and complexity

    Serosurvey of selected avian pathogens in Brazilian commercial Rheas (Rhea americana) and Ostriches (Struthio camelus)

    Ratite farming has expanded worldwide. Due to the intensive farming methods used by ratite producers, preventive medicine practices should be established. In this context, the surveillance and control of some avian pathogens are essential for the success of the ratite industry; however, little is known about the health status of ratites in Brazil. Therefore, the prevalence of antibodies against Newcastle disease virus, Chlamydophila psittaci, Mycoplasma gallisepticum, Mycoplasma synoviae, and Salmonella Pullorum was evaluated in 100 serum samples collected from commercial ostriches and in 80 serum samples from commercial rheas reared in Brazil. All sampled animals were clinically healthy. The results showed that all ostriches and rheas were serologically negative for Newcastle disease virus, Chlamydophila psittaci, Mycoplasma gallisepticum, and Mycoplasma synoviae. Positive antibody responses against Salmonella Pullorum antigen were not detected in ostrich sera, but were detected in two rhea serum samples. These results can be considered a warning as to the presence of Salmonella spp. in ratite farms. Therefore, the implementation of good health management and surveillance programs in ratite farms may help improve not only animal production, but also public health conditions. Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

    Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients

    Background: Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). The pros and cons of several different types of decision tree-based regression methods are discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p), which are often used in physiologically-based pharmacokinetics. This work therefore assesses whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Results: Comparison of the models that used only molecular descriptors, in particular the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Conclusions: The decision tree-based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. © 2015 Freitas et al.; licensee Springer
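    The abstract reports accuracy as a mean fold error but does not define it. The sketch below assumes the geometric form that is common in pharmacokinetics, 10 raised to the mean absolute log10 ratio of predicted to observed values; the paper's exact definition may differ, and the function name and sample values are hypothetical.

```python
import math


def mean_fold_error(predicted, observed):
    """Geometric mean fold error (assumed definition, not taken from the paper).

    A value of 1.0 means perfect predictions; 2.0 means predictions are off
    by a factor of two on average, in either direction.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Absolute log10 ratio treats over- and under-prediction symmetrically.
    abs_logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(abs_logs) / len(abs_logs))


# Hypothetical Vss values (L/kg): one over-predicted 2x, one under-predicted 2x.
example = mean_fold_error([2.0, 0.5], [1.0, 1.0])
```

    Under this definition, a two-fold over-prediction and a two-fold under-prediction both count as a fold error of 2, so they do not cancel out.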

    Learning Interpretable Rules for Multi-label Classification

    Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area. Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further informatio

    Historical analysis of the Brazilian cervical cancer screening program from 2006 to 2013: a time for reflection

    BACKGROUND: The Cervical Cancer Database of the Brazilian National Health Service (SISCOLO) contains information regarding all cervical cytological tests and, if properly explored, can be used as a tool for monitoring and managing the cervical cancer screening program. The aim of this study was to perform a historical analysis of the cervical cancer screening program in Brazil from 2006 to 2013. MATERIAL AND METHODS: The data necessary to calculate quality indicators were obtained from the SISCOLO, a Brazilian health system tool. Joinpoint analysis was used to calculate the annual percentage change. RESULTS: We observed important trends showing decreased rates of low-grade squamous intraepithelial lesions (LSIL) and high-grade squamous intraepithelial lesions (HSIL) and an increased rate of rejected exams from 2009 to 2013. The index of positivity was maintained at levels below those indicated by international standards; very low frequencies of unsatisfactory cases were observed over the study period, which partially contradicts the low rate of positive cases. The number of positive cytological diagnoses was below that expected, considering that developed countries with low frequencies of cervical cancer detect more lesions annually. CONCLUSIONS: The evolution of indicators from 2006 to 2013 suggests that actions must be taken to improve the effectiveness of cervical cancer control in Brazil
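    The annual percentage change mentioned above is conventionally derived from a log-linear fit: regress ln(rate) on year and report (e^slope - 1) x 100. The sketch below fits a single segment only; it does not search for joinpoints, and the function name and rates are hypothetical rather than SISCOLO data.

```python
import math


def annual_percentage_change(years, rates):
    """APC for one segment from a least-squares fit of ln(rate) on year."""
    if len(years) != len(rates) or len(years) < 2:
        raise ValueError("need at least two (year, rate) pairs")
    logs = [math.log(r) for r in rates]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope of ln(rate) versus year.
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical indicator that grows by 10% per year.
apc = annual_percentage_change([2009, 2010, 2011, 2012], [100.0, 110.0, 121.0, 133.1])
```

    A full joinpoint analysis would additionally select the years at which the slope changes and report one APC per segment.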

    Astrobiological Complexity with Probabilistic Cellular Automata

    The search for extraterrestrial life and intelligence constitutes one of the major endeavors in science, but it has so far been modeled quantitatively only rarely, and then in a cursory and superficial fashion. We argue that probabilistic cellular automata (PCA) represent the best quantitative framework for modeling the astrobiological history of the Milky Way and its Galactic Habitable Zone. The relevant astrobiological parameters are to be modeled as the elements of the input probability matrix for the PCA kernel. With the underlying simplicity of the cellular automata constructs, this approach enables a quick analysis of a large and ambiguous input parameter space. We perform a simple clustering analysis of typical astrobiological histories and discuss the relevant boundary conditions of practical importance for planning and guiding actual empirical astrobiological and SETI projects. In addition to showing how the present framework is adaptable to more complex situations and updated observational databases from current and near-future space missions, we demonstrate how numerical results could offer a cautious rationale for continuation of practical SETI searches. Comment: 37 pages, 11 figures, 2 tables; added journal reference belo
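    As a rough illustration of what a probabilistic cellular automaton is, the toy sketch below updates a one-dimensional binary lattice with periodic boundaries, where each cell switches on with a probability chosen by the number of occupied cells in its three-cell neighborhood. The states, lattice, and probabilities are hypothetical; this is not the paper's model or its input probability matrix.

```python
import random


def pca_step(cells, p_on, rng):
    """One synchronous update of a 1-D binary PCA with periodic boundaries.

    p_on[k] is the probability that a cell becomes 1 when k cells in its
    neighborhood (left neighbor, itself, right neighbor) are currently 1.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        k = cells[i - 1] + cells[i] + cells[(i + 1) % n]
        new_cells.append(1 if rng.random() < p_on[k] else 0)
    return new_cells


def run_pca(cells, p_on, steps, seed=0):
    """Apply pca_step repeatedly with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    for _ in range(steps):
        cells = pca_step(cells, p_on, rng)
    return cells


# With probabilities of exactly 0 or 1 the rule is deterministic:
# an occupied site spreads to both neighbors at every step.
spread = run_pca([0, 0, 1, 0, 0], p_on=[0.0, 1.0, 1.0, 1.0], steps=1)
```

    Intermediate probabilities, for example p_on=[0.01, 0.3, 0.6, 0.9], give stochastic histories, and many seeded runs can then be compared or clustered.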

    Strong interface-induced spin-orbit coupling in graphene on WS2

    Interfacial interactions allow the electronic properties of graphene to be modified, as recently demonstrated by the appearance of satellite Dirac cones in the band structure of graphene on hexagonal boron nitride (hBN) substrates. Ongoing research strives to explore interfacial interactions in a broader class of materials in order to engineer targeted electronic properties. Here we show that at an interface with a tungsten disulfide (WS2) substrate, the strength of the spin-orbit interaction (SOI) in graphene is very strongly enhanced. The induced SOI leads to a pronounced low-temperature weak anti-localization (WAL) effect, from which we determine the spin-relaxation time. We find that the spin-relaxation time in graphene is two to three orders of magnitude smaller on WS2 than on SiO2 or hBN, and that it is comparable to the intervalley scattering time. To interpret our findings we have performed first-principles electronic structure calculations, which both confirm that carriers in graphene-on-WS2 experience a strong SOI and allow us to extract a spin-dependent low-energy effective Hamiltonian. Our analysis further shows that the use of WS2 substrates opens a possible new route to access topological states of matter in graphene-based systems. Comment: Originally submitted version in compliance with editorial guidelines. Final version with expanded discussion of the relation between theory and experiments to be published in Nature Communication

    An ant colony-based semi-supervised approach for learning classification rules

    Semi-supervised learning methods create models from a few labeled instances and a great number of unlabeled instances. They are a good option in scenarios where there is a lot of unlabeled data and the process of labeling instances is expensive, as is the case for most Web applications. This paper proposes a semi-supervised self-training algorithm called Ant-Labeler. Self-training algorithms take advantage of supervised learning algorithms to iteratively learn a model from the labeled instances and then use this model to classify unlabeled instances. The instances that receive labels with high confidence are moved from the unlabeled to the labeled set, and this process is repeated until a stopping criterion is met, such as labeling all unlabeled instances. Ant-Labeler uses an ACO algorithm as the supervised learning method in the self-training procedure to generate interpretable rule-based models, which are used as an ensemble to ensure accurate predictions. The pheromone matrix is reused across different executions of the ACO algorithm to avoid rebuilding the models from scratch every time the labeled set is updated. Results showed that the proposed algorithm obtains better predictive accuracy than three state-of-the-art algorithms in roughly half of the datasets on which it was tested, and the smaller the number of labeled instances, the better the Ant-Labeler performance
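    The generic self-training loop described above can be sketched as follows. This is not Ant-Labeler: a one-dimensional nearest-centroid classifier stands in for the ACO rule learner, and the confidence score, threshold, function names, and data are all hypothetical choices made for illustration.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Stand-in supervised learner: one mean value per class (1-D data)."""
    return {c: mean(p for p, l in zip(points, labels) if l == c)
            for c in sorted(set(labels))}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence lies between 0.5 and 1.0."""
    dists = sorted((abs(x - m), c) for c, m in centroids.items())
    best_dist, best_label = dists[0]
    if len(dists) == 1:
        return best_label, 1.0
    total = best_dist + dists[1][0]
    # Confidence grows as x gets closer to the best centroid than the runner-up.
    return best_label, (1.0 if total == 0 else dists[1][0] / total)


def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.8, max_rounds=10):
    """Move confidently labeled instances into the labeled set, then retrain."""
    labeled_x, labeled_y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        if not pool:
            break  # stopping criterion: every instance has been labeled
        model = fit_centroids(labeled_x, labeled_y)
        remaining = []
        for x in pool:
            label, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                labeled_x.append(x)
                labeled_y.append(label)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # stopping criterion: no new confident labels this round
        pool = remaining
    return fit_centroids(labeled_x, labeled_y), pool


# Two labeled points, five unlabeled; 5.0 is ambiguous and stays unlabeled.
model, still_unlabeled = self_train([0.0, 10.0], ["a", "b"],
                                    [1.0, 9.0, 2.0, 8.5, 5.0])
```

    In this sketch the model is rebuilt from scratch each round, whereas the abstract notes that Ant-Labeler reuses the pheromone matrix between rounds to avoid that cost.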

    Effect of synbiotic supplementation in children and adolescents with cystic fibrosis: a randomized controlled clinical trial

    BACKGROUND/OBJECTIVES: Cystic fibrosis (CF) is characterized by excessive activation of immune processes. The aim of this study was to evaluate the effect of synbiotic supplementation on the inflammatory response in children/adolescents with CF. SUBJECTS/METHODS: A randomized, placebo-controlled, double-blind clinical trial was conducted with a control group (CG, n = 17), placebo-CF-group (PCFG, n = 19), synbiotic CF-group (SCFG, n = 22), PCFG negative (n = 8) and positive (n = 11) bacteriology, and SCFG negative (n = 12) and positive (n = 10) bacteriology. Markers of lung function (FEV1), nutritional status [body mass index-for-age (BMI/A), height-for-age (H/A), weight-for-age (W/A), upper-arm fat area (UFA), upper-arm muscle area (UMA), body fat (%BF)], and inflammation [interleukin (IL)-12, tumor necrosis factor-alpha (TNF-α), IL-10, IL-6, IL-1β, IL-8, myeloperoxidase (MPO), nitric oxide metabolites (NOx)] were evaluated before and after 90 days of supplementation with a synbiotic. RESULTS: No significant difference was found between the baseline and end evaluations of FEV1 and nutritional status markers. A significant interaction (time vs. group) was found for IL-12 (p = 0.010) and myeloperoxidase (p = 0.036) between PCFG and SCFG; however, the difference was not maintained after assessing the groups individually. NOx diminished significantly after supplementation in the SCFG (p = 0.030). In the SCFG with positive bacteriology, reductions were found in IL-6 (p = 0.033) and IL-8 (p = 0.009) after supplementation. CONCLUSIONS: Synbiotic supplementation showed promise at diminishing the pro-inflammatory markers IL-6 and IL-8 in the SCFG with positive bacteriology and NOx in the SCFG in children/adolescents with CF