108 research outputs found

    Deep learning systems as complex networks

    Full text link
    Thanks to the availability of large scale digital datasets and massive amounts of computational power, deep learning algorithms can learn representations of data by exploiting multiple levels of abstraction. These machine learning methods have greatly improved the state-of-the-art in many challenging cognitive tasks, such as visual object recognition, speech processing, natural language understanding and automatic translation. In particular, one class of deep learning models, known as deep belief networks, can discover intricate statistical structure in large data sets in a completely unsupervised fashion, by learning a generative model of the data using Hebbian-like learning mechanisms. Although these self-organizing systems can be conveniently formalized within the framework of statistical mechanics, their internal functioning remains opaque, because their emergent dynamics cannot be solved analytically. In this article we propose to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.Comment: 20 pages, 9 figure

    Cognition-Based Networks: A New Perspective on Network Optimization Using Learning and Distributed Intelligence

    Get PDF
    IEEE Access Volume 3, 2015, Article number 7217798, Pages 1512-1530 Open Access Cognition-based networks: A new perspective on network optimization using learning and distributed intelligence (Article) Zorzi, M.a , Zanella, A.a, Testolin, A.b, De Filippo De Grazia, M.b, Zorzi, M.bc a Department of Information Engineering, University of Padua, Padua, Italy b Department of General Psychology, University of Padua, Padua, Italy c IRCCS San Camillo Foundation, Venice-Lido, Italy View additional affiliations View references (107) Abstract In response to the new challenges in the design and operation of communication networks, and taking inspiration from how living beings deal with complexity and scalability, in this paper we introduce an innovative system concept called COgnition-BAsed NETworkS (COBANETS). The proposed approach develops around the systematic application of advanced machine learning techniques and, in particular, unsupervised deep learning and probabilistic generative models for system-wide learning, modeling, optimization, and data representation. Moreover, in COBANETS, we propose to combine this learning architecture with the emerging network virtualization paradigms, which make it possible to actuate automatic optimization and reconfiguration strategies at the system level, thus fully unleashing the potential of the learning approach. Compared with the past and current research efforts in this area, the technical approach outlined in this paper is deeply interdisciplinary and more comprehensive, calling for the synergic combination of expertise of computer scientists, communications and networking engineers, and cognitive scientists, with the ultimate aim of breaking new ground through a profound rethinking of how the modern understanding of cognition can be used in the management and optimization of telecommunication network

    Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models

    Get PDF
    Creating learning models that can exhibit sophisticated reasoning skills is one of the greatest challenges in deep learning research, and mathematics is rapidly becoming one of the target domains for assessing scientific progress in this direction. In the past few years there has been an explosion of neural network architectures, data sets, and benchmarks specifically designed to tackle mathematical problems, reporting notable success in disparate fields such as automated theorem proving, numerical integration, and discovery of new conjectures or matrix multiplication algorithms. However, despite these impressive achievements it is still unclear whether deep learning models possess an elementary understanding of quantities and symbolic numbers. In this survey we critically examine the recent literature, concluding that even state-of-the-art architectures often fall short when probed with relatively simple tasks designed to test basic numerical and arithmetic knowledge

    Modeling cognition with generative neural networks: The case of orthographic processing

    Get PDF
    This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about the upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of networks can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances allow to extend basic network architectures to build more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up, sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns. In particular, we perform computational simulations of recognition of handwritten and printed characters belonging to different writing scripts, which are successively combined spatially or temporally in order to build more complex orthographic units such as those constituting English words

    Visual Enumeration is Challenging for Large-scale Generative AI

    Full text link
    Humans can readily judge the number of objects in a visual scene, even without counting, and such a skill has been documented in many animal species and babies prior to language development and formal schooling. Numerical judgments are error-free for small sets, while for larger collections responses become approximate, with variability increasing proportionally to the target number. This response pattern is observed for items of all kinds, despite variation in object features (such as color or shape), suggesting that our visual number sense relies on abstract representations of numerosity. Here, we investigate whether large-scale generative Artificial Intelligence (AI) systems have a human-like number sense, which should allow them to reliably name the number of objects in simple visual stimuli or generate images containing a target number of items in the 1-10 range. Surprisingly, most of the foundation models considered have a poor number sense: They make striking errors even with small numbers, the response variability does not increase in a systematic way, and the pattern of errors depends on object category. Only the most recent proprietary systems exhibit signatures of a visual number sense. Our findings demonstrate that having an intuitive visual understanding of number remains challenging for foundation models, which in turn might be detrimental to the perceptual grounding of numeracy that in humans is crucial for mathematical learning

    Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies

    Full text link
    Large Language Models (LLMs) have revolutionized the field of Natural Language Processing thanks to their ability to reuse knowledge acquired on massive text corpora on a wide variety of downstream tasks, with minimal (if any) tuning steps. At the same time, it has been repeatedly shown that LLMs lack systematic generalization, which allows to extrapolate the learned statistical regularities outside the training distribution. In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters. We compare the performance of GPT-4 with that of its predecessor (GPT-3.5) and with a variant of the Transformer-Encoder architecture recently introduced to solve similar tasks, the Neural Data Router. We find that the deployment of advanced prompting techniques allows GPT-4 to reach superior accuracy on all tasks, demonstrating that state-of-the-art LLMs constitute a very strong baseline also in challenging tasks that require systematic generalization.Comment: Accepted at LREC-COLING 2024. Added acknowledgement

    Emergence of Network Motifs in Deep Neural Networks

    Full text link
    Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called "network motifs". In this article we show that network science tools can be successfully applied also to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Our simulations show that the final network topology is primarily shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks

    Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

    Full text link
    Large Language Models (LLMs) achieve impressive performance in a wide range of tasks, even if they are often trained with the only objective of chatting fluently with users. Among other skills, LLMs show emergent abilities in mathematical reasoning benchmarks, which can be elicited with appropriate prompting methods. In this work, we systematically investigate the capabilities and limitations of popular open-source LLMs on different symbolic reasoning tasks. We evaluate three models of the Llama 2 family on two datasets that require solving mathematical formulas of varying degrees of difficulty. We test a generalist LLM (Llama 2 Chat) as well as two fine-tuned versions of Llama 2 (MAmmoTH and MetaMath) specifically designed to tackle mathematical problems. We observe that both increasing the scale of the model and fine-tuning it on relevant tasks lead to significant performance gains. Furthermore, using fine-grained evaluation measures, we find that such performance gains are mostly observed with mathematical formulas of low complexity, which nevertheless often remain challenging even for the largest fine-tuned models.Comment: Accepted at 33rd International Conference on Artificial Neural Networks (ICANN24
    corecore