342 research outputs found

    Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

    Recent technological advances have greatly increased the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g. Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further improved by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption, compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively. Comment: 4 pages, 3 figures, 1 table
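
The abstract above contrasts deterministic and stochastic weight binarization without giving formulas. As a minimal sketch, assuming the widely used formulation in which a weight becomes +1 with probability given by a "hard sigmoid" of its real value, the two schemes might look as follows. The function names, the example weights, and the choice of hard sigmoid are assumptions for illustration; they are not taken from the paper and say nothing about its OpenCL kernels.

```python
import random


def hard_sigmoid(x):
    """Clip (x + 1) / 2 into [0, 1]; used here as P(weight -> +1)."""
    return min(1.0, max(0.0, (x + 1.0) / 2.0))


def binarize_deterministic(weights):
    """Deterministic binarization: the sign of each weight (0 maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def binarize_stochastic(weights, rng):
    """Stochastic binarization: +1 with probability hard_sigmoid(w), else -1."""
    return [1 if rng.random() < hard_sigmoid(w) else -1 for w in weights]


# Hypothetical real-valued weights, for illustration only.
weights = [-1.5, -0.2, 0.0, 0.4, 2.0]
rng = random.Random(0)  # seeded so repeated runs give the same draws

det = binarize_deterministic(weights)
sto = binarize_stochastic(weights, rng)
print(det)
print(sto)
```

Under this assumption, a weight inside [-1, 1] is preserved in expectation by the stochastic scheme (E[w_b] = 2p - 1 = w), whereas the deterministic scheme keeps only its sign.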

    Design and Implementation of BCM Rule Based on Spike-Timing Dependent Plasticity

    The Bienenstock-Cooper-Munro (BCM) and Spike Timing-Dependent Plasticity (STDP) rules are two experimentally verified forms of synaptic plasticity where the alteration of synaptic weight depends upon the rate and the timing of pre- and post-synaptic firing of action potentials, respectively. Previous studies have reported that under specific conditions, i.e. when random trains of Poissonian distributed spikes are used as inputs and weight changes occur according to STDP, the BCM rule is an emergent property. Here, the applied STDP rule can be either the classical pair-based STDP rule or the more powerful triplet-based STDP rule. In this paper, we demonstrate the use of two distinct VLSI circuit implementations of STDP to examine whether BCM learning is an emergent property of STDP. These circuits are stimulated with random Poissonian spike trains. The first circuit implements the classical pair-based STDP, while the second circuit realizes a previously described triplet-based STDP rule. These two circuits are simulated using a 0.35 µm standard CMOS model in the HSpice simulator. Simulation results demonstrate that the proposed triplet-based STDP circuit produces the threshold-based behaviour of the BCM rule. The results also show similar behaviour for the pair-based STDP VLSI circuit in generating the BCM rule.
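
To make the pair-based rule mentioned above concrete, the following is a minimal behavioural sketch, assuming the textbook exponential STDP window and all-to-all spike pairing. It is not a model of the VLSI circuits in the paper; the amplitudes, time constants, firing rates, and function names are hypothetical values chosen only for illustration.

```python
import math
import random


def poisson_spike_train(rate_hz, duration_ms, rng):
    """Spike times (ms) of a homogeneous Poisson process with the given rate."""
    spikes, t = [], 0.0
    while True:
        t += rng.expovariate(rate_hz / 1000.0)  # exponential inter-spike interval
        if t >= duration_ms:
            return spikes
        spikes.append(t)


def pair_stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre."""
    if dt_ms > 0:   # pre before post: potentiation
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:   # post before pre: depression
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def total_weight_change(pre_spikes, post_spikes):
    """Sum the pair rule over every pre/post pair (all-to-all interaction)."""
    return sum(pair_stdp_dw(t_post - t_pre)
               for t_pre in pre_spikes for t_post in post_spikes)


rng = random.Random(1)
pre = poisson_spike_train(10.0, 1000.0, rng)   # hypothetical 10 Hz input
post = poisson_spike_train(20.0, 1000.0, rng)  # hypothetical 20 Hz output
print(total_weight_change(pre, post))
```

Sweeping the post-synaptic rate and plotting the summed weight change against it is one way to check for BCM-like, threshold-based behaviour in such a model.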

    Efficient Design of Triplet Based Spike-Timing Dependent Plasticity

    Spike-Timing Dependent Plasticity (STDP) is believed to play an important role in learning and the formation of computational function in the brain. The classical model of STDP, which considers the timing between pairs of pre-synaptic and post-synaptic spikes (p-STDP), is incapable of reproducing synaptic weight changes similar to those seen in biological experiments that investigate the effect of either higher-order spike trains (e.g. triplets and quadruplets of spikes) or the simultaneous effect of the rate and timing of spike pairs on synaptic plasticity. In this paper, we first investigate synaptic weight changes using a p-STDP circuit and show how it fails to reproduce the mentioned complex biological experiments. We then present a new STDP VLSI circuit which acts based on the timing among triplets of spikes (t-STDP) and is able to reproduce all the mentioned experimental results. We believe that our new STDP VLSI circuit improves upon previous circuits: its learning capacity exceeds that of current designs owing to its ability to mimic the outcomes of biological experiments more closely, and it can thus play a significant role in future VLSI implementations of neuromorphic systems.

    An enhanced MOSFET threshold voltage model for the 6–300 K temperature range

    An enhanced threshold voltage model for MOSFETs operating over a wide range of temperatures (6–300 K) is presented. The model takes into account the carrier freeze-out effect and the external field-assisted ionization to address the temperature dependence of MOS transistors. For simplicity, an empirical function is incorporated to predict short channel effects over the temperature range. The results from the proposed model demonstrate good agreement with measurements of NMOS and PMOS transistors from fabricated chips.

    Automated machine learning for healthcare and clinical notes analysis

    Machine learning (ML) has been slowly entering every aspect of our lives and its positive impact has been astonishing. To accelerate embedding ML in more applications and incorporating it in real-world scenarios, automated machine learning (AutoML) is emerging. The main purpose of AutoML is to provide seamless integration of ML in various industries, which will facilitate better outcomes in everyday tasks. In healthcare, AutoML has been already applied to easier settings with structured data such as tabular lab data. However, there is still a need for applying AutoML for interpreting medical text, which is being generated at a tremendous rate. For this to happen, a promising method is AutoML for clinical notes analysis, which is an unexplored research area representing a gap in ML research. The main objective of this paper is to fill this gap and provide a comprehensive survey and analytical study towards AutoML for clinical notes. To that end, we first introduce the AutoML technology and review its various tools and techniques. We then survey the literature of AutoML in the healthcare industry and discuss the developments specific to clinical settings, as well as those using general AutoML tools for healthcare applications. With this background, we then discuss challenges of working with clinical notes and highlight the benefits of developing AutoML for medical notes processing. Next, we survey relevant ML research for clinical notes and analyze the literature and the field of AutoML in the healthcare industry. Furthermore, we propose future research directions and shed light on the challenges and opportunities this emerging field holds. With this, we aim to assist the community with the implementation of an AutoML platform for medical notes, which if realized can revolutionize patient outcomes

    Training Progressively Binarizing Deep Networks Using FPGAs

    While hardware implementations of inference routines for Binarized Neural Networks (BNNs) are plentiful, current realizations of efficient BNN hardware training accelerators, suitable for Internet of Things (IoT) edge devices, leave much to be desired. Conventional BNN hardware training accelerators perform forward and backward propagations with parameters adopting binary representations, and optimization using parameters adopting floating or fixed-point real-valued representations, which requires two distinct sets of network parameters. In this paper, we propose a hardware-friendly training method that, contrary to conventional methods, progressively binarizes a singular set of fixed-point network parameters, yielding notable reductions in power and resource utilization. We use the Intel FPGA SDK for OpenCL development environment to train our progressively binarizing DNNs on an OpenVINO FPGA. We benchmark our training approach on both GPUs and FPGAs using CIFAR-10 and compare it to conventional BNNs. Comment: Accepted at 2020 IEEE International Symposium on Circuits and Systems (ISCAS)

    Design and analysis of efficient QCA reversible adders

    Quantum-dot cellular automata (QCA), an emerging nanotechnology, are envisioned to overcome the scaling and heat dissipation issues of current CMOS technology. In a QCA structure, information destruction plays an essential role in the overall heat dissipation, and in turn in the power consumption of the system. Therefore, reversible logic, which significantly controls the information flow of the system, is deemed suitable to achieve ultra-low-power structures. In order to benefit from the opportunities QCA and reversible logic provide, in this paper, we first review and implement prior reversible full-adder art in QCA. We then propose a novel reversible design based on three- and five-input majority gates, and a robust one-layer crossover scheme. The new full-adder significantly advances previous designs in terms of the optimization metrics, namely cell count, area, and delay. The proposed efficient full-adder is then used to design reversible ripple-carry adders (RCAs) with different sizes (i.e., 4, 8, and 16 bits). It is demonstrated that the new RCAs lead to 33% fewer garbage outputs, which can be essential in terms of lowering power consumption. This, along with the achieved improvements in area, complexity, and delay, introduces an ultra-efficient reversible QCA adder that can be beneficial in developing future computer arithmetic circuits and architecture
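
The abstract above builds its adder from three- and five-input majority gates. As a purely logical illustration, the sketch below uses the well-known identities carry = MAJ3(a, b, c_in) and sum = MAJ5(a, b, c_in, NOT carry, NOT carry) and chains the resulting full adder into a ripple-carry adder. It is an assumption-laden sketch: it models neither reversibility, garbage outputs, nor the QCA cell layout, and the paper's actual gate arrangement is not given in the abstract. All function names are hypothetical.

```python
def maj3(a, b, c):
    """Three-input majority gate on bits."""
    return 1 if a + b + c >= 2 else 0


def maj5(a, b, c, d, e):
    """Five-input majority gate on bits."""
    return 1 if a + b + c + d + e >= 3 else 0


def full_adder(a, b, c_in):
    """Full adder from majority gates: returns (sum_bit, carry_out)."""
    c_out = maj3(a, b, c_in)
    not_c_out = 1 - c_out
    s = maj5(a, b, c_in, not_c_out, not_c_out)
    return s, c_out


def ripple_carry_add(x, y, n_bits):
    """Add two n-bit unsigned integers; returns (n-bit sum, final carry)."""
    carry, result = 0, 0
    for i in range(n_bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


print(ripple_carry_add(11, 6, 4))  # 11 + 6 = 17 -> (1, 1) in 4 bits plus carry
```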

    Semi-supervised and weakly-supervised deep neural networks and dataset for fish detection in turbid underwater videos

    Fish are key members of marine ecosystems, and they have a significant share in the healthy human diet. Besides, fish abundance is an excellent indicator of water quality, as they have adapted to various levels of oxygen, turbidity, nutrients, and pH. To detect various fish in underwater videos, Deep Neural Networks (DNNs) can be of great assistance. However, training DNNs is highly dependent on large, labeled datasets, while labeling fish in turbid underwater video frames is a laborious and time-consuming task, hindering the development of accurate and efficient models for fish detection. To address this problem, firstly, we have collected a dataset called FishInTurbidWater, which consists of a collection of video footage gathered from turbid waters, and quickly and weakly (i.e., giving higher priority to speed over accuracy) labeled them in a 4-times fast-forwarding software. Next, we designed and implemented a semi-supervised contrastive learning fish detection model that is self-supervised using unlabeled data, and then fine-tuned with a small fraction (20%) of our weakly labeled FishInTurbidWater data. At the next step, we trained, using our weakly labeled data, a novel weakly-supervised ensemble DNN with transfer learning from ImageNet. The results show that our semi-supervised contrastive model leads to more than 20 times faster turnaround time between dataset collection and result generation, with reasonably high accuracy (89%). At the same time, the proposed weakly-supervised ensemble model can detect fish in turbid waters with high (94%) accuracy, while still cutting the development time by a factor of four, compared to fully-supervised models trained on carefully labeled datasets. Our dataset and code are publicly available at the hyperlink FishInTurbidWater

    Variation-aware binarized memristive networks

    The quantization of weights to binary states in Deep Neural Networks (DNNs) can replace resource-hungry multiply accumulate operations with simple accumulations. Such Binarized Neural Networks (BNNs) exhibit greatly reduced resource and power requirements. In addition, memristors have been shown as promising synaptic weight elements in DNNs. In this paper, we propose and simulate novel Binarized Memristive Convolutional Neural Network (BMCNN) architectures employing hybrid weight and parameter representations. We train the proposed architectures offline and then map the trained parameters to our binarized memristive devices for inference. To take into account the variations in memristive devices, and to study their effect on the performance, we introduce variations in R_ON and R_OFF. Moreover, we introduce means to mitigate the adverse effect of memristive variations in our proposed networks. Finally, we benchmark our BMCNNs and variation-aware BMCNNs using the MNIST dataset.

    Robotic spot spraying of Harrisia cactus (Harrisia martinii) in grazing pastures of the Australian rangelands

    Harrisia cactus, Harrisia martinii, is a serious weed affecting hundreds of thousands of hectares of native pasture in the Australian rangelands. Despite the landmark success of past biological control agents for the invasive weed and significant investment in its eradication by the Queensland Government (roughly $156M since 1960), it still takes hold in the cooler rangeland environments of northern New South Wales and southern Queensland. In the past decade, landholders with large infestations in these locations have spent approximately $20,000 to $30,000 per annum on herbicide control measures to reduce the impact of the weed on their grazing operations. Current chemical control requires manual hand spot spraying with high quantities of herbicide for foliar application. These methods are labour intensive and costly, and in some cases inhibit landholders from performing control at all. Robotic spot spraying offers a potential solution to these issues, but existing solutions are not suitable for the rangeland environment. This work presents the methods and results of an in situ field trial of a novel robotic spot spraying solution, AutoWeed, for treating harrisia cactus that (1) more than halves the operation time, (2) can reduce herbicide usage by up to 54% and (3) can reduce the cost of herbicide by up to $18.15 per ha compared to the existing hand spraying approach. The AutoWeed spot spraying system used the MobileNetV2 deep learning architecture to perform real time spot spraying of harrisia cactus with 97.2% average recall accuracy and weed knockdown efficacy of up to 96%. Experimental trials showed that the AutoWeed spot sprayer achieved the same level of knockdown of harrisia cactus as traditional hand spraying in low, medium and high density infestations. This work represents a significant step forward for spot spraying of weeds in the Australian rangelands that will reduce labour and herbicide costs for landholders as the technology sees more uptake in the future.