Learning Combinations of Activation Functions
Over the last decade, an active line of research has been devoted to designing
novel activation functions that help deep neural networks converge and achieve
better performance. Training these architectures usually optimizes only the
weights of their layers, while the non-linearities are pre-specified and their
(possible) parameters are treated as hyper-parameters to be tuned manually. In
this paper, we introduce two approaches to automatically learn different
combinations of base activation functions (such as the identity function, ReLU,
and tanh) during the training phase. We present a thorough comparison of our
novel approaches with well-known architectures (such as LeNet-5, AlexNet, and
ResNet-56) on three standard datasets (Fashion-MNIST, CIFAR-10, and
ILSVRC-2012), showing substantial improvements in overall performance, such as
a 3.01 percentage point increase in top-1 accuracy for AlexNet on ILSVRC-2012.
Comment: 6 pages, 3 figures. Published as a conference paper at ICPR 2018.
Code:
https://bitbucket.org/francux/learning_combinations_of_activation_function
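A minimal sketch of the idea, assuming a PyTorch setting: a drop-in module that
learns a convex combination of the base activations (identity, ReLU, tanh) by
backpropagation. The module name, softmax parameterization, and initialization
are illustrative assumptions, not the authors' released code (see the Bitbucket
link above for that).

import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedActivation(nn.Module):
    """Convex combination of identity, ReLU and tanh with learned weights."""
    def __init__(self):
        super().__init__()
        # One trainable coefficient per base activation, optimized jointly
        # with the layer weights during training.
        self.coeffs = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        bases = torch.stack((x, F.relu(x), torch.tanh(x)), dim=0)
        # Softmax keeps the mixture convex (an illustrative choice; the
        # paper's exact parameterization may differ).
        w = torch.softmax(self.coeffs, dim=0)
        return torch.einsum('k,k...->...', w, bases)

# Usage: replace a fixed non-linearity with the learnable one.
model = nn.Sequential(nn.Linear(128, 64), LearnedActivation(), nn.Linear(64, 10))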
Reduced formulation of a steady fluid-structure interaction problem with parametric coupling
We propose a two-fold approach to model reduction of fluid-structure
interaction. The state equations for the fluid are solved with reduced basis
methods. These are model reduction methods for parametric partial differential
equations using well-chosen snapshot solutions in order to build a set of
global basis functions. The other reduction is in terms of the geometric
complexity of the moving fluid-structure interface. We use free-form
deformations to parameterize the perturbation of the flow channel from its rest
configuration. As a computational example, we consider a steady fluid-structure
interaction problem: an incompressible Stokes flow in a channel with a flexible
wall.
Comment: 10 pages, 3 figures
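To make the reduced-basis ingredient concrete, the following is a minimal
sketch, assuming a generic linear parametric system A(mu) u = f(mu) and NumPy:
a global basis is built from snapshot solutions via POD (an SVD of the snapshot
matrix), and the full-order problem is Galerkin-projected onto it. Function
names and the energy-based truncation are assumptions, not the paper's actual
FSI solver.

import numpy as np

def build_reduced_basis(snapshots, tol=1e-6):
    # Columns of `snapshots` are full-order solutions at sampled parameters.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    rank = int(np.searchsorted(energy, 1.0 - tol)) + 1
    return U[:, :rank]                # global basis V of size N x n, n << N

def solve_reduced(A, f, V):
    # Galerkin projection of the full-order system A u = f onto span(V).
    A_r = V.T @ A @ V                 # n x n reduced operator
    f_r = V.T @ f
    u_r = np.linalg.solve(A_r, f_r)   # cheap reduced solve
    return V @ u_r                    # lift back to the full-order space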
Automated Pruning for Deep Neural Network Compression
In this work we present a method to improve the pruning step of the current
state-of-the-art methodology to compress neural networks. The novelty of the
proposed pruning technique is in its differentiability, which allows pruning to
be performed during the backpropagation phase of the network training. This
enables end-to-end learning and strongly reduces the training time. The
technique is based on a family of differentiable pruning functions and a new
regularizer specifically designed to enforce pruning. The experimental results
show that jointly optimizing the thresholds and the network weights makes it
possible to reach a higher compression rate, reducing the number of weights of
the pruned network by a further 14% to 33% compared to the current
state-of-the-art. Furthermore, we believe this is the first study in which the
transfer-learning generalization capabilities of the features extracted by a
pruned network are analyzed. To this end, we show that
the representations learned with the proposed pruning methodology maintain the
same effectiveness and generality as those learned by the corresponding
non-compressed network on a set of different recognition tasks.
Comment: 8 pages, 5 figures. Published as a conference paper at ICPR 2018.
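As a rough illustration of pruning as a differentiable operation, here is a
hedged sketch in PyTorch of a linear layer whose weights pass through a smooth
gate controlled by a learnable threshold, plus a penalty that encourages
sparsity. The sigmoid gate, its sharpness, and the regularizer are illustrative
assumptions in the spirit of the abstract, not the authors' exact pruning
functions or regularizer.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftPrunedLinear(nn.Module):
    def __init__(self, in_features, out_features, beta=50.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.threshold = nn.Parameter(torch.tensor(1e-2))  # learned with the weights
        self.beta = beta  # sharpness of the soft gate

    def gate(self):
        # Smooth step: ~0 for |w| below the threshold, ~1 above it, so the
        # pruning decision can be optimized by backpropagation.
        return torch.sigmoid(self.beta * (self.weight.abs() - self.threshold))

    def forward(self, x):
        return F.linear(x, self.weight * self.gate(), self.bias)

    def sparsity_penalty(self):
        # Regularizer pushing gates toward zero, i.e. enforcing pruning;
        # add it (scaled) to the task loss during training.
        return self.gate().mean()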
