Deep learning systems as complex networks
Thanks to the availability of large scale digital datasets and massive
amounts of computational power, deep learning algorithms can learn
representations of data by exploiting multiple levels of abstraction. These
machine learning methods have greatly improved the state-of-the-art in many
challenging cognitive tasks, such as visual object recognition, speech
processing, natural language understanding and automatic translation. In
particular, one class of deep learning models, known as deep belief networks,
can discover intricate statistical structure in large data sets in a completely
unsupervised fashion, by learning a generative model of the data using
Hebbian-like learning mechanisms. Although these self-organizing systems can be
conveniently formalized within the framework of statistical mechanics, their
internal functioning remains opaque, because their emergent dynamics cannot be
solved analytically. In this article we propose to study deep belief networks
using techniques commonly employed in the study of complex networks, in order
to gain some insights into the structural and functional properties of the
computational graph resulting from the learning process.
Comment: 20 pages, 9 figures
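The core idea of the paper, treating a trained network's computational graph as a weighted complex network, can be sketched as follows. The weight matrices below are random stand-ins for weights a deep belief network would learn, and the choice of networkx with node "strength" (weighted degree) as the measure is an illustrative assumption, not the paper's exact methodology:

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# Hypothetical weight matrices of a small two-layer network,
# stand-ins for weights learned by a deep belief network.
W1 = rng.normal(size=(8, 16))   # visible -> hidden layer 1
W2 = rng.normal(size=(16, 4))   # hidden layer 1 -> hidden layer 2

# Build a directed, weighted graph from the layer-wise connectivity.
G = nx.DiGraph()
for layer, W in enumerate([W1, W2]):
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            G.add_edge(f"L{layer}_{i}", f"L{layer + 1}_{j}",
                       weight=abs(W[i, j]))

# Node "strength" (weighted out-degree), a standard complex-network
# measure that highlights the most influential units.
strength = dict(G.out_degree(weight="weight"))
top = max(strength, key=strength.get)
print("strongest node:", top, round(strength[top], 2))
```

Once the weights are cast as a graph, any network-science measure (clustering, motif counts, degree distributions) can be applied to the learned topology.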
Cognition-Based Networks: A New Perspective on Network Optimization Using Learning and Distributed Intelligence
IEEE Access
Volume 3, 2015, Article number 7217798, Pages 1512-1530
Zorzi, M.a, Zanella, A.a, Testolin, A.b, De Filippo De Grazia, M.b, Zorzi, M.b,c
a Department of Information Engineering, University of Padua, Padua, Italy
b Department of General Psychology, University of Padua, Padua, Italy
c IRCCS San Camillo Foundation, Venice-Lido, Italy
In response to the new challenges in the design and operation of communication networks, and taking inspiration from how living beings deal with complexity and scalability, in this paper we introduce an innovative system concept called COgnition-BAsed NETworkS (COBANETS). The proposed approach develops around the systematic application of advanced machine learning techniques and, in particular, unsupervised deep learning and probabilistic generative models for system-wide learning, modeling, optimization, and data representation. Moreover, in COBANETS, we propose to combine this learning architecture with the emerging network virtualization paradigms, which make it possible to actuate automatic optimization and reconfiguration strategies at the system level, thus fully unleashing the potential of the learning approach. Compared with the past and current research efforts in this area, the technical approach outlined in this paper is deeply interdisciplinary and more comprehensive, calling for the synergistic combination of expertise of computer scientists, communications and networking engineers, and cognitive scientists, with the ultimate aim of breaking new ground through a profound rethinking of how the modern understanding of cognition can be used in the management and optimization of telecommunication networks.
Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models
Creating learning models that can exhibit sophisticated reasoning skills is
one of the greatest challenges in deep learning research, and mathematics is
rapidly becoming one of the target domains for assessing scientific progress in
this direction. In the past few years there has been an explosion of neural
network architectures, data sets, and benchmarks specifically designed to
tackle mathematical problems, reporting notable success in disparate fields
such as automated theorem proving, numerical integration, and discovery of new
conjectures or matrix multiplication algorithms. However, despite these
impressive achievements it is still unclear whether deep learning models
possess an elementary understanding of quantities and symbolic numbers. In this
survey we critically examine the recent literature, concluding that even
state-of-the-art architectures often fall short when probed with relatively
simple tasks designed to test basic numerical and arithmetic knowledge.
Modeling cognition with generative neural networks: The case of orthographic processing
This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about the upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures to build more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up, sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns.
In particular, we perform computational simulations of recognition of handwritten and printed characters belonging to different writing scripts, which are successively combined spatially or temporally in order to build more complex orthographic units such as those constituting English words.
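The basic building block of the generative networks discussed above is the restricted Boltzmann machine, trained with Hebbian-like contrastive divergence. A minimal one-step (CD-1) sketch is shown below; the toy binary data and hyper-parameters are assumptions for illustration, not the thesis's actual character datasets:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary data: 6-pixel "images" (an illustrative stand-in
# for the character bitmaps used in the thesis).
data = rng.integers(0, 2, size=(100, 6)).astype(float)

n_vis, n_hid = 6, 4
W = 0.01 * rng.normal(size=(n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)
lr = 0.1

for epoch in range(50):
    # Positive phase: sample hidden units given the data.
    p_h = sigmoid(data @ W + b_h)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    # Negative phase: one step of Gibbs sampling (CD-1).
    p_v = sigmoid(h @ W.T + b_v)
    p_h2 = sigmoid(p_v @ W + b_h)
    # Hebbian-like update: data correlations minus model correlations.
    W += lr * (data.T @ p_h - p_v.T @ p_h2) / len(data)
    b_v += lr * (data - p_v).mean(axis=0)
    b_h += lr * (p_h - p_h2).mean(axis=0)

recon = sigmoid(sigmoid(data @ W + b_h) @ W.T + b_v)
print("reconstruction error:", round(float(np.mean((data - recon) ** 2)), 3))
```

Hierarchical (deep belief) and temporal variants stack or recurrently connect such modules, but the learning rule at each stage remains this unsupervised, correlation-based update.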
Visual Enumeration is Challenging for Large-scale Generative AI
Humans can readily judge the number of objects in a visual scene, even
without counting, and such a skill has been documented in many animal species
and babies prior to language development and formal schooling. Numerical
judgments are error-free for small sets, while for larger collections responses
become approximate, with variability increasing proportionally to the target
number. This response pattern is observed for items of all kinds, despite
variation in object features (such as color or shape), suggesting that our
visual number sense relies on abstract representations of numerosity. Here, we
investigate whether large-scale generative Artificial Intelligence (AI) systems
have a human-like number sense, which should allow them to reliably name the
number of objects in simple visual stimuli or generate images containing a
target number of items in the 1-10 range. Surprisingly, most of the foundation
models considered have a poor number sense: They make striking errors even with
small numbers, the response variability does not increase in a systematic way,
and the pattern of errors depends on object category. Only the most recent
proprietary systems exhibit signatures of a visual number sense. Our findings
demonstrate that having an intuitive visual understanding of number remains
challenging for foundation models, which in turn might be detrimental to the
perceptual grounding of numeracy that in humans is crucial for mathematical
learning.
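The "variability increasing proportionally to the target number" signature mentioned above is commonly quantified with the coefficient of variation, which stays roughly constant under a human-like number sense. A minimal sketch of that analysis on simulated responses (the Weber fraction and sample sizes are illustrative assumptions, not the study's data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated enumeration responses: for each target numerosity n,
# responses are drawn with standard deviation proportional to n
# (Weber-like scalar variability; purely illustrative data).
targets = np.arange(1, 11)
weber = 0.15
responses = {n: rng.normal(n, weber * n, size=200) for n in targets}

# A human-like number sense predicts a roughly constant
# coefficient of variation (sd / mean) across target numbers.
cov = {n: responses[n].std() / responses[n].mean() for n in targets}
for n in targets:
    print(n, round(cov[n], 3))
```

Applied to a model's numerical responses, a flat coefficient-of-variation curve would indicate human-like scalar variability, while erratic or category-dependent curves would reveal the deficits reported above.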
Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies
Large Language Models (LLMs) have revolutionized the field of Natural
Language Processing thanks to their ability to reuse knowledge acquired on
massive text corpora on a wide variety of downstream tasks, with minimal (if
any) tuning steps. At the same time, it has been repeatedly shown that LLMs
lack systematic generalization, that is, the ability to extrapolate learned
statistical regularities outside the training distribution. In this work, we
offer a systematic benchmarking of GPT-4, one of the most advanced LLMs
available, on three algorithmic tasks characterized by the possibility to
control the problem difficulty with two parameters. We compare the performance
of GPT-4 with that of its predecessor (GPT-3.5) and with a variant of the
Transformer-Encoder architecture recently introduced to solve similar tasks,
the Neural Data Router. We find that the deployment of advanced prompting
techniques allows GPT-4 to reach superior accuracy on all tasks, demonstrating
that state-of-the-art LLMs constitute a very strong baseline also in
challenging tasks that require systematic generalization.
Comment: Accepted at LREC-COLING 2024.
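The benchmarking setup described above rests on two ingredients: algorithmic tasks whose difficulty is controlled by two parameters, and prompting strategies such as few-shot demonstrations. A toy sketch of both is given below; the list-manipulation task and the prompt template are hypothetical illustrations, not the paper's actual tasks or prompts:

```python
import random

random.seed(0)

def make_task(list_len, n_ops):
    """Generate a toy algorithmic problem: apply a random sequence of
    reversals/rotations to a list. Difficulty is controlled by two
    parameters, list length and number of operations."""
    xs = list(range(list_len))
    ops = [random.choice(["reverse", "rotate"]) for _ in range(n_ops)]
    out = xs[:]
    for op in ops:
        out = out[::-1] if op == "reverse" else out[1:] + out[:1]
    question = f"Apply {ops} to {xs}. What is the result?"
    return question, out

def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt string, one solved example per block."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {query}\nA:"

demo_q, demo_a = make_task(list_len=3, n_ops=2)
q, answer = make_task(list_len=5, n_ops=3)
prompt = few_shot_prompt([(demo_q, demo_a)], q)
print(prompt)
```

Because the generator also returns the ground-truth answer, a model's completion of the prompt can be scored exactly, and accuracy can be charted as a function of the two difficulty parameters.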
Emergence of Network Motifs in Deep Neural Networks
Network science can offer fundamental insights into the structural and
functional properties of complex systems. For example, it is widely known that
neuronal circuits tend to organize into basic functional topological modules,
called "network motifs". In this article we show that network science tools can
be successfully applied also to the study of artificial neural networks
operating according to self-organizing (learning) principles. In particular, we
study the emergence of network motifs in multi-layer perceptrons, whose initial
connectivity is defined as a stack of fully-connected, bipartite graphs. Our
simulations show that the final network topology is primarily shaped by
learning dynamics, but can be strongly biased by choosing appropriate weight
initialization schemes. Overall, our results suggest that non-trivial
initialization strategies can make learning more effective by promoting the
development of useful network motifs, which are often surprisingly consistent
with those observed in general transduction networks.
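Motif analysis of the kind described above can be sketched by thresholding a layer's weight matrix and counting a classic motif such as the "bi-fan" (two inputs both projecting to the same two outputs). The random weights and the 1.0 threshold below are illustrative assumptions, not the paper's trained networks or pruning rule:

```python
import itertools
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)

# Hypothetical weight matrix of one fully-connected layer: keep only
# strong connections, a stand-in for the post-learning topology.
W = rng.normal(size=(6, 6))
mask = np.abs(W) > 1.0  # threshold weak edges away

G = nx.DiGraph()
G.add_edges_from((f"in{i}", f"out{j}")
                 for i, j in zip(*np.nonzero(mask)))

# Count "bi-fan" motifs: two inputs both projecting to the same
# two outputs, a motif common in transduction networks.
inputs = [n for n in G if n.startswith("in")]
bifans = 0
for a, b in itertools.combinations(inputs, 2):
    shared = set(G.successors(a)) & set(G.successors(b))
    bifans += len(shared) * (len(shared) - 1) // 2
print("bi-fan motifs:", bifans)
```

Comparing such counts against randomized graphs with the same degree sequence is the standard way to decide whether a motif is over-represented in the learned topology.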
Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models
Large Language Models (LLMs) achieve impressive performance in a wide range
of tasks, even if they are often trained with the only objective of chatting
fluently with users. Among other skills, LLMs show emergent abilities in
mathematical reasoning benchmarks, which can be elicited with appropriate
prompting methods. In this work, we systematically investigate the capabilities
and limitations of popular open-source LLMs on different symbolic reasoning
tasks. We evaluate three models of the Llama 2 family on two datasets that
require solving mathematical formulas of varying degrees of difficulty. We test
a generalist LLM (Llama 2 Chat) as well as two fine-tuned versions of Llama 2
(MAmmoTH and MetaMath) specifically designed to tackle mathematical problems.
We observe that both increasing the scale of the model and fine-tuning it on
relevant tasks lead to significant performance gains. Furthermore, using
fine-grained evaluation measures, we find that such performance gains are
mostly observed with mathematical formulas of low complexity, which
nevertheless often remain challenging even for the largest fine-tuned models.
Comment: Accepted at the 33rd International Conference on Artificial Neural Networks (ICANN 2024).
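Datasets of mathematical formulas with varying degrees of difficulty, as used in the evaluation above, can be produced by a recursive generator that also yields the exact value for automatic scoring. This is an illustrative sketch of such a generator, not the paper's actual datasets; nesting depth serves as the complexity knob:

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def make_formula(depth, rng):
    """Build a random nested arithmetic expression of a given depth,
    returning both its string form and its exact value."""
    if depth == 0:
        n = rng.randint(0, 9)
        return str(n), n
    op = rng.choice(list(OPS))
    left_expr, left_val = make_formula(depth - 1, rng)
    right_expr, right_val = make_formula(depth - 1, rng)
    return f"({left_expr} {op} {right_expr})", OPS[op](left_val, right_val)

rng = random.Random(0)
for depth in range(1, 4):
    expr, value = make_formula(depth, rng)
    # The exact value lets a model's answer be scored automatically.
    print(depth, expr, "=", value)
    assert eval(expr) == value
```

Binning model accuracy by `depth` yields exactly the kind of fine-grained, complexity-stratified evaluation measure the abstract refers to.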
