68 research outputs found

    Do retinal ganglion cells project natural scenes to their principal subspace and whiten them?

    Several theories of early sensory processing suggest that it whitens sensory stimuli. Here, we test three key predictions of the whitening theory using recordings from 152 ganglion cells in the salamander retina responding to natural movies. We confirm the previous finding that the firing rates of ganglion cells are less correlated than the natural scenes themselves, although significant correlations remain. We show that while the power spectrum of the ganglion cell responses decays less steeply than that of the natural scenes, it is not completely flattened. Finally, we find evidence that only the top principal components of the visual stimulus are transmitted. Comment: 2016 Asilomar Conference on Signals, Systems and Computers.
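
    As a rough illustration of the three checks described above, the sketch below compares pairwise correlations and covariance eigenspectra (the "power spectrum" in the principal-component sense) between stimulus pixels and firing rates. It is a minimal sketch, not the authors' code; `stimulus` and `rates` are random placeholders standing in for the natural-movie patches and the recorded cells.

```python
# Minimal sketch of the whitening checks; `stimulus` and `rates` are
# hypothetical placeholders, not the recorded salamander data.
import numpy as np

def mean_abs_offdiag_corr(x):
    """Average |pairwise correlation| across columns, ignoring the diagonal."""
    c = np.corrcoef(x, rowvar=False)
    off = c[~np.eye(c.shape[0], dtype=bool)]
    return np.mean(np.abs(off))

def covariance_spectrum(x):
    """Eigenvalues of the column covariance, sorted from largest to smallest."""
    cov = np.cov(x, rowvar=False)
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

rng = np.random.default_rng(0)
stimulus = rng.standard_normal((5000, 64))   # placeholder natural-movie patches
rates = rng.standard_normal((5000, 16))      # placeholder ganglion-cell firing rates

# Prediction 1: responses should be less correlated than the stimulus.
print(mean_abs_offdiag_corr(stimulus), mean_abs_offdiag_corr(rates))

# Prediction 2: the response spectrum should decay less steeply (be flatter)
# than the stimulus spectrum, though not perfectly flat.
print(covariance_spectrum(stimulus)[:5], covariance_spectrum(rates)[:5])
```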

    Interpretable Machine Learning with Applications in Neuroscience

    In the past decade, research in machine learning has been principally focused on the development of algorithms and models with high predictive capabilities. Models such as convolutional neural networks (CNNs) achieve state-of-the-art predictive performance for many tasks in computer vision, autonomous driving, and transfer learning. However, interpreting these models remains a challenge, primarily because of the large number of parameters involved. In this thesis, we investigate two regimes based on (1) compression and (2) stability to build more interpretable machine learning models. These regimes are demonstrated in a computational neuroscience study. In the first part of the thesis, we introduce a greedy structural compression scheme that prunes filters in a trained CNN. To do this, we define a filter importance index equal to the classification accuracy reduction (CAR) of the network after pruning that filter (defined analogously as RAR for regression). CAR achieves state-of-the-art classification accuracy compared to other filter pruning schemes. Furthermore, we demonstrate the interpretability of CAR-compressed CNNs by showing that our algorithm prunes filters with visually redundant functionalities, such as color filters. In the second part of this thesis, we introduce DeepTune, a stability-driven visualization and interpretation framework for CNN-based models. DeepTune is used to characterize biological neurons in the V4 area of the primate visual cortex. V4 is a large retinotopically organized area of the visual cortex located between the primary visual cortex and high-level areas in the inferior temporal lobe. V4 neurons have highly nonlinear response properties, and it is notoriously difficult to construct quantitative models that accurately describe how visual information is represented in V4. To better understand the filtering properties of these neurons, we study recordings from 71 well-isolated cells stimulated with 4000-12000 static grayscale natural images collected by the Gallant Lab at UC Berkeley. Our CNN-based models of V4 neurons achieve state-of-the-art accuracy in predicting neural spike rates in a hold-out validation set (average predictive correlation of 0.53 for 71 neurons). Then, we employ our DeepTune stability-driven interpretation framework and discover that the V4 neurons are tuned to a remarkable diversity of textures (40% of the neurons), contour shapes (30% of the neurons), and complex patterns (30% of the neurons). Most importantly, these smooth DeepTune images provide testable naturalistic stimuli for future experiments on V4 neurons. In the final part of this thesis, we study the application of CAR- and RAR-compressed CNNs in modeling V4 neurons. Both CAR and RAR compression give rise to a new set of simpler models for V4 neurons with accuracy similar to existing state-of-the-art models. For each of the accurate CAR- and RAR-compressed models of V4 neurons (up to a 90% compression rate), the smooth DeepTune images are stable and exhibit patterns similar to the uncompressed model's consensus DeepTune image. Our results suggest, to some extent, that these CNNs resemble the structure of the primate brain.
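
    The CAR importance index described above lends itself to a short sketch: ablate one filter at a time and record how much the classification accuracy drops. The code below is a minimal illustration under stated assumptions, not the thesis implementation; `model`, `conv_name`, and `loader` are hypothetical PyTorch stand-ins.

```python
# Sketch of a CAR-style filter importance score: accuracy drop after ablating
# one filter of a chosen convolutional layer. Hypothetical helper, not the
# authors' code.
import copy
import torch

@torch.no_grad()
def eval_accuracy(model, loader, device="cpu"):
    model.eval()
    correct = total = 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1)
        correct += (pred == y.to(device)).sum().item()
        total += y.numel()
    return correct / total

@torch.no_grad()
def car_importance(model, conv_name, loader, device="cpu"):
    """Classification accuracy reduction for each filter in one conv layer."""
    base = eval_accuracy(model, loader, device)
    conv = dict(model.named_modules())[conv_name]
    scores = []
    for f in range(conv.out_channels):
        pruned = copy.deepcopy(model)
        pconv = dict(pruned.named_modules())[conv_name]
        pconv.weight[f].zero_()                  # ablate filter f
        if pconv.bias is not None:
            pconv.bias[f].zero_()
        scores.append(base - eval_accuracy(pruned, loader, device))
    return scores  # greedy pruning removes filters with the smallest CAR first
```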

    Machine Learning for Uncovering Biological Insights in Spatial Transcriptomics Data

    Development and homeostasis in multicellular systems both require exquisite control over spatial molecular pattern formation and maintenance. Advances in spatially resolved, high-throughput molecular imaging methods such as multiplexed immunofluorescence and spatial transcriptomics (ST) provide exciting new opportunities to augment our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to rapid development of innovative machine learning (ML) tools, primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signals from noise in complex biological systems. However, it can be difficult to understand and balance the different implicit assumptions and methodologies of a rapidly expanding toolbox of analytical tools in ST. To address this, we summarize the major ST analysis goals that ML can help address, along with current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in their choices of the right tools for the right biological questions.

    Definitions, methods, and applications in interpretable machine learning.

    Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the predictive, descriptive, relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post hoc categories, with subgroups including sparsity, modularity, and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often underappreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.

    Zero-shot sampling of adversarial entities in biomedical question answering

    The increasing depth of parametric domain knowledge in large language models (LLMs) is fueling their rapid deployment in real-world applications. In high-stakes and knowledge-intensive tasks, understanding model vulnerabilities is essential for quantifying the trustworthiness of model predictions and regulating their use. The recent discovery of named entities as adversarial examples in natural language processing tasks raises questions about their potential guises in other settings. Here, we propose a power-scaled distance-weighted sampling scheme in embedding space to discover diverse adversarial entities as distractors. We demonstrate its advantage over random sampling in adversarial question answering on biomedical topics. Our approach enables the exploration of different regions on the attack surface, which reveals two regimes of adversarial entities that markedly differ in their characteristics. Moreover, we show that the attacks successfully manipulate token-wise Shapley value explanations, which become deceptive in the adversarial setting. Our investigations illustrate the brittleness of domain knowledge in LLMs and reveal a shortcoming of standard evaluations for high-capacity models. Comment: 20 pages incl. appendix, under review.
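
    One plausible reading of the power-scaled distance-weighted sampling scheme is sketched below: candidate entities are drawn with probability proportional to a power of their embedding distance from a reference entity. This is an illustrative assumption rather than the paper's exact procedure; `entity_vecs`, `reference_vec`, and the exponent `gamma` are hypothetical.

```python
# Illustrative power-scaled distance-weighted sampling of distractor entities.
# Assumed reading of the scheme; names and the exponent are placeholders.
import numpy as np

def power_scaled_sample(entity_vecs, reference_vec, k, gamma=2.0, seed=0):
    """Draw k distractor indices, weighting each entity by distance**gamma."""
    rng = np.random.default_rng(seed)
    d = np.linalg.norm(entity_vecs - reference_vec, axis=1)
    d = np.where(d > 0, d, np.finfo(float).tiny)   # guard against zero distance
    w = d ** gamma                                  # gamma > 0 favors distant entities,
    p = w / w.sum()                                 # gamma < 0 favors nearby ones
    return rng.choice(len(entity_vecs), size=k, replace=False, p=p)

# Example with random embeddings standing in for biomedical entity vectors.
vecs = np.random.default_rng(1).standard_normal((1000, 64))
print(power_scaled_sample(vecs, vecs[0], k=5, gamma=2.0))
```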