
    Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions

    Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability: they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as "black box" models, and in the past, have been trained purely to optimize the accuracy of predictions. In this work, we create a novel network architecture for deep learning that naturally explains its own reasoning for each prediction. This architecture contains an autoencoder and a special prototype layer, where each unit of that layer stores a weight vector that resembles an encoded training input. The encoder of the autoencoder allows us to do comparisons within the latent space, while the decoder allows us to visualize the learned prototypes. The training objective has four terms: an accuracy term, a term that encourages every prototype to be similar to at least one encoded input, a term that encourages every encoded input to be close to at least one prototype, and a term that encourages faithful reconstruction by the autoencoder. The distances computed in the prototype layer are used as part of the classification process. Since the prototypes are learned during training, the learned network naturally comes with explanations for each prediction, and the explanations are loyal to what the network actually computes. Comment: The first two authors contributed equally, 8 pages, accepted in AAAI 201
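The four-term objective described in this abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the abstract does not give the distance measure or the term weights, so squared Euclidean distance, softmax cross-entropy for the accuracy term, and the names `prototype_objective`, `lam_cls`, `lam_p`, `lam_z` and `lam_rec` are all hypothetical choices made here.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=2)


def prototype_objective(x, x_rec, z, prototypes, logits, labels,
                        lam_cls=1.0, lam_p=1.0, lam_z=1.0, lam_rec=1.0):
    """Weighted sum of the four terms; the weights are hypothetical."""
    # Accuracy term: softmax cross-entropy (an assumed choice of loss).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -np.mean(log_probs[np.arange(len(labels)), labels])

    dists = pairwise_sq_dists(z, prototypes)  # shape (n_inputs, n_prototypes)

    # Every prototype should be similar to at least one encoded input.
    prototype_to_input = dists.min(axis=0).mean()
    # Every encoded input should be close to at least one prototype.
    input_to_prototype = dists.min(axis=1).mean()

    # Faithful reconstruction by the autoencoder.
    reconstruction = np.mean(np.sum((x - x_rec) ** 2, axis=1))

    total = (lam_cls * cross_entropy + lam_p * prototype_to_input
             + lam_z * input_to_prototype + lam_rec * reconstruction)
    parts = {
        "accuracy": cross_entropy,
        "prototype_to_input": prototype_to_input,
        "input_to_prototype": input_to_prototype,
        "reconstruction": reconstruction,
    }
    return total, parts
```

The two minimum-distance terms differ only in the axis over which the minimum is taken, which is what makes one of them pull prototypes toward the data and the other pull the data toward prototypes.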

    Perron-based algorithms for the multilinear pagerank

    We consider the multilinear pagerank problem studied in [Gleich, Lim and Yu, Multilinear Pagerank, 2015], which is a system of quadratic equations with stochasticity and nonnegativity constraints. We use the theory of quadratic vector equations to prove several properties of its solutions and suggest new numerical algorithms. In particular, we prove the existence of a certain minimal solution, which does not always coincide with the stochastic one that is required by the problem. We use an interpretation of the solution as a Perron eigenvector to devise new fixed-point algorithms for its computation, and pair them with a homotopy continuation strategy. The resulting numerical method is more reliable than the existing alternatives, being able to solve a larger number of problems.
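To make the problem concrete, here is a minimal sketch of the plain fixed-point iteration for the quadratic system in the form used by Gleich, Lim and Yu, x = alpha * R (x ⊗ x) + (1 - alpha) * v, with R an n-by-n² column-stochastic matrix and v a stochastic vector. It is not one of the Perron-based algorithms the abstract proposes; it is the simple baseline, which is known to converge for alpha below 1/2 and may fail otherwise. The function name and the `tol` and `max_iter` parameters are assumptions of this sketch.

```python
import numpy as np


def multilinear_pagerank_fixed_point(R, v, alpha, tol=1e-12, max_iter=1000):
    """Iterate x <- alpha * R (x kron x) + (1 - alpha) * v from x = v.

    R: (n, n*n) column-stochastic matrix; v: stochastic vector of length n.
    Returns the last iterate, whether or not the tolerance was reached.
    """
    x = v.copy()
    for _ in range(max_iter):
        x_new = alpha * R @ np.kron(x, x) + (1.0 - alpha) * v
        x_new /= x_new.sum()  # guard against rounding drift off the simplex
        if np.abs(x_new - x).sum() < tol:
            return x_new
        x = x_new
    return x
```

For example, with a random column-stochastic `R` of shape `(3, 9)`, uniform `v` and `alpha=0.3`, the returned vector is nonnegative, sums to one, and satisfies the quadratic system to within the tolerance.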

    Generalized Induced Norms

    Let ||.|| be a norm on the algebra M_n of all n-by-n matrices over the complex field C. An interesting problem in matrix theory is whether there are two norms ||.||_1 and ||.||_2 on C^n such that ||A|| = max{||Ax||_2 : ||x||_1 = 1} for all A in M_n. We investigate this problem and its various aspects and discuss the conditions under which ||.||_1 = ||.||_2. Comment: 8 page
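One special case of the formula max{||Ax||_2 : ||x||_1 = 1} is easy to compute and may help fix ideas. If the norm on the domain is chosen to be the ordinary l1 (sum of absolute values) norm, the maximum of the convex function x -> ||Ax|| over the l1 unit sphere is attained at a signed standard basis vector, so the generalized induced norm is simply the largest target-norm of a column of A. The sketch below assumes that choice of domain norm and a vector p-norm on the target side; the function name is hypothetical and the abstract itself treats arbitrary pairs of norms.

```python
import numpy as np


def induced_norm_from_l1(A, target_ord):
    """max{ ||A x||_target : ||x||_l1 = 1 }, computed as the largest column norm.

    target_ord is any vector-norm order accepted by np.linalg.norm (1, 2, np.inf, ...).
    """
    return max(np.linalg.norm(A[:, j], ord=target_ord) for j in range(A.shape[1]))
```

With the l1 norm on both sides this reproduces the familiar maximum absolute column sum, and with the max norm on the target side it gives the largest absolute entry of A, two matrix norms that arise from different pairs of vector norms.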

    GSplit LBI: Taming the Procedural Bias in Neuroimaging for Disease Prediction

    In voxel-based neuroimage analysis, lesion features have been the main focus in disease prediction due to their interpretability with respect to the related diseases. However, we observe that there exists another type of feature, introduced during the preprocessing steps, which we call "Procedural Bias". Moreover, such bias can be leveraged to improve classification accuracy. Nevertheless, most existing models suffer either from under-fitting, because they do not consider procedural bias, or from poor interpretability, because they do not differentiate such bias from lesion features. In this paper, a novel dual-task algorithm, namely GSplit LBI, is proposed to resolve this problem. By introducing an augmented variable that is constrained to be structurally sparse through a variable splitting term, the estimators for prediction and for selecting lesion features can be optimized separately and mutually monitored by each other following an iterative scheme. Experiments have been conducted on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The advantage of the proposed model is verified by improved stability of selected lesion features and better classification results. Comment: Conditionally accepted by MICCAI, 201

    Data Quality Assurance and Performance Measurement of Data Mining for Preventive Maintenance of Power Grid

    Ensuring reliability as the electrical grid morphs into the "smart grid" will require innovations in how we assess the state of the grid for the purpose of proactive, rather than reactive, maintenance. In the future, we will not only react to failures, but also try to anticipate and avoid them using predictive modeling (machine learning and data mining) techniques. To help in meeting this challenge, we present the Neutral Online Visualization-aided Autonomic evaluation framework (NOVA) for evaluating machine learning and data mining algorithms for preventive maintenance on the electrical grid. NOVA has three stages provided through a unified user interface: evaluation of input data quality, evaluation of machine learning and data mining results, and evaluation of the reliability improvement of the power grid. A prototype version of NOVA has been deployed for the power grid in New York City, and it is able to evaluate machine learning and data mining systems effectively and efficiently.

    Functional Multi-Layer Perceptron: a Nonlinear Tool for Functional Data Analysis

    In this paper, we study a natural extension of Multi-Layer Perceptrons (MLP) to functional inputs. We show that fundamental results for classical MLP can be extended to functional MLP. We obtain universal approximation results that show the expressive power of functional MLP is comparable to that of numerical MLP. We obtain consistency results which imply that the estimation of optimal parameters for functional MLP is statistically well defined. Finally, we show on simulated and real-world data that the proposed model performs in a very satisfactory way. Comment: http://www.sciencedirect.com/science/journal/0893608
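The abstract does not spell out how a perceptron accepts a function as input. A common way to do so, assumed here for illustration, is to replace the weighted sum of a numerical neuron with an integral of the input function against a weight function: output = T(b + ∫ w(t) g(t) dt). The sketch below approximates that integral with the trapezoid rule on sampled values; the `tanh` activation, the sampling grid and all names are hypothetical choices, not details taken from the paper.

```python
import numpy as np


def trapezoid(y, t):
    """Trapezoid-rule approximation of the integral of y over the grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)


def functional_neuron(g_vals, w_vals, t, bias):
    """One functional neuron: tanh(bias + integral of w(t) * g(t) dt).

    g_vals and w_vals are the input and weight functions sampled on the grid t.
    """
    return np.tanh(bias + trapezoid(w_vals * g_vals, t))
```

For instance, with g(t) = t and w(t) = 1 on [0, 1] and zero bias, the integral is 1/2 and the neuron outputs tanh(0.5). In practice the weight function would itself be parameterized (for example by a small basis expansion) so that it can be learned.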

    A Compact Linear Programming Relaxation for Binary Sub-modular MRF

    We propose a novel compact linear programming (LP) relaxation for binary sub-modular MRF in the context of object segmentation. Our model is obtained by linearizing an $l_1^+$-norm derived from the quadratic programming (QP) form of the MRF energy. The resultant LP model contains significantly fewer variables and constraints compared to the conventional LP relaxation of the MRF energy. In addition, unlike QP which can produce ambiguous labels, our model can be viewed as a quasi-total-variation minimization problem, and it can therefore preserve the discontinuities in the labels. We further establish a relaxation bound between our LP model and the conventional LP model. In the experiments, we demonstrate our method for the task of interactive object segmentation. Our LP model outperforms QP when converting the continuous labels to binary labels using different threshold values on the entire Oxford interactive segmentation dataset. The computational complexity of our LP is of the same order as that of the QP, and it is significantly lower than the conventional LP relaxation.

    Some extremal functions in Fourier analysis, III

    We obtain the best approximation in $L^1(\mathbb{R})$, by entire functions of exponential type, for a class of even functions that includes $e^{-\lambda|x|}$, where $\lambda > 0$, $\log |x|$ and $|x|^{\alpha}$, where $-1 < \alpha < 1$. We also give periodic versions of these results where the approximating functions are trigonometric polynomials of bounded degree. Comment: 26 pages. Submitte