Recurrent Models for Situation Recognition
This work proposes Recurrent Neural Network (RNN) models to predict
structured 'image situations' -- actions and noun entities fulfilling semantic
roles related to the action. In contrast to prior work relying on Conditional
Random Fields (CRFs), we use a specialized action prediction network followed
by an RNN for noun prediction. Our system obtains state-of-the-art accuracy on
the challenging recent imSitu dataset, beating CRF-based models, including ones
trained with additional data. Further, we show that specialized features
learned from situation prediction can be transferred to the task of image
captioning to more accurately describe human-object interactions.
Comment: To appear at ICCV 201
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
This paper presents a method for adding multiple tasks to a single deep
neural network while avoiding catastrophic forgetting. Inspired by network
pruning techniques, we exploit redundancies in large deep networks to free up
parameters that can then be employed to learn new tasks. By performing
iterative pruning and network re-training, we are able to sequentially "pack"
multiple tasks into a single network while ensuring minimal drop in performance
and minimal storage overhead. Unlike prior work that uses proxy losses to
maintain accuracy on older tasks, we always optimize for the task at hand. We
perform extensive experiments on a variety of network architectures and
large-scale datasets, and observe much better robustness against catastrophic
forgetting than prior work. In particular, we are able to add three
fine-grained classification tasks to a single ImageNet-trained VGG-16 network
and achieve accuracies close to those of separately trained networks for each
task. Code available at https://github.com/arunmallya/packne
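The packing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the flat weight list, the `owner` bookkeeping, the 50% pruning fraction, and all function names are assumptions made for illustration, and random values stand in for actual training.

```python
import random

FREE = 0  # owner id for parameters not yet claimed by any task (hypothetical convention)

def train_task(weights, owner, task_id, rng):
    """Stand-in for training: only free parameters are claimed and updated."""
    for i, who in enumerate(owner):
        if who == FREE:
            weights[i] = rng.gauss(0.0, 1.0)  # placeholder for gradient updates
            owner[i] = task_id

def prune_task(weights, owner, task_id, fraction):
    """Release the smallest-magnitude `fraction` of the parameters owned by task_id."""
    mine = [i for i, who in enumerate(owner) if who == task_id]
    mine.sort(key=lambda i: abs(weights[i]))
    for i in mine[: int(len(mine) * fraction)]:
        weights[i] = 0.0
        owner[i] = FREE

def weights_for_task(weights, owner, task_id):
    """Parameters visible to task_id: those owned by it or by earlier tasks."""
    return [w if FREE < who <= task_id else 0.0 for w, who in zip(weights, owner)]

rng = random.Random(0)
weights = [0.0] * 1000
owner = [FREE] * 1000

train_task(weights, owner, 1, rng)
prune_task(weights, owner, 1, 0.5)
task1_snapshot = weights_for_task(weights, owner, 1)

# Adding task 2 only touches parameters that task 1 released.
train_task(weights, owner, 2, rng)
prune_task(weights, owner, 2, 0.5)
```

Because parameters owned by task 1 are never rewritten while task 2 is added, the weights seen by task 1 are identical before and after, which is the sense in which forgetting is avoided in this sketch. A real implementation would also retrain after each pruning step, as the abstract describes.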
Adaptive Object Detection Using Adjacency and Zoom Prediction
State-of-the-art object detection systems rely on an accurate set of region
proposals. Several recent methods use a neural network architecture to
hypothesize promising object locations. While these approaches are
computationally efficient, they rely on fixed image regions as anchors for
predictions. In this paper we propose to use a search strategy that adaptively
directs computational resources to sub-regions likely to contain objects.
Compared to methods based on fixed anchor locations, our approach naturally
adapts to cases where object instances are sparse and small. Our approach is
comparable in terms of accuracy to the state-of-the-art Faster R-CNN approach
while using two orders of magnitude fewer anchors on average. Code is publicly
available.
Comment: Accepted to CVPR 201
Lower Bounds for the Cop Number When the Robber is Fast
We consider a variant of the Cops and Robbers game where the robber can move
t edges at a time, and show that in this variant, the cop number of a d-regular
graph with girth larger than 2t+2 is Omega(d^t). By the known upper bounds on
the order of cages, this implies that the cop number of a connected n-vertex
graph can be as large as Omega(n^{2/3}) if t>1, and Omega(n^{4/5}) if t>3. This
improves the Omega(n^{(t-3)/(t-2)}) lower bound of Frieze, Krivelevich, and Loh
(Variations on Cops and Robbers, J. Graph Theory, 2011) when 1<t<7. We also
conjecture a general upper bound O(n^{t/(t+1)}) for the cop number in this
variant, generalizing Meyniel's conjecture.
Comment: 5 page
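The stated range of improvement can be checked by comparing the exponents of n directly. The sketch below uses exact fractions; the function names are illustrative, and the check starts at t = 3 because the expression (t-3)/(t-2) is undefined at t = 2.

```python
from fractions import Fraction

def new_exponent(t):
    """Exponent of n in the lower bound quoted in the abstract."""
    if t > 3:
        return Fraction(4, 5)
    if t > 1:
        return Fraction(2, 3)
    raise ValueError("the abstract states bounds only for t > 1")

def old_exponent(t):
    """Exponent (t-3)/(t-2) of the earlier bound; undefined at t = 2."""
    return Fraction(t - 3, t - 2)

def strictly_improved(t):
    """True when the new exponent is strictly larger than the old one."""
    return new_exponent(t) > old_exponent(t)

improved = [t for t in range(3, 10) if strictly_improved(t)]
```

At t = 7 the two exponents coincide at 4/5, and for larger t the earlier bound is stronger, which matches the condition t < 7 in the abstract.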
On the Spectrum of Wenger Graphs
Let q = p^e, where p is a prime and e >= 1 is an integer. For m >= 1,
let P and L be two copies of the (m+1)-dimensional vector spaces over the
finite field F_q. Consider the bipartite graph W_m(q) with partite
sets P and L defined as follows: a point (p) = (p_1, p_2, ..., p_{m+1}) in P
is adjacent to a line [l] = [l_1, l_2, ..., l_{m+1}] in L if and only if the
following m equalities hold: l_{i+1} + p_{i+1} = l_i p_1 for i = 1, ..., m.
We call the graphs W_m(q) Wenger graphs. In this paper, we determine all
distinct eigenvalues of the adjacency matrix of W_m(q) and their
multiplicities. We also survey results on Wenger graphs.
Comment: 9 pages; accepted for publication to J. Combin. Theory, Series
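A small construction may make the definition concrete. The sketch below assumes the usual definition of the Wenger graph W_m(q), in which a point (p_1, ..., p_{m+1}) is adjacent to a line [l_1, ..., l_{m+1}] when l_{i+1} + p_{i+1} = l_i p_1 for i = 1, ..., m. It is restricted to prime q so that arithmetic modulo q gives the field F_q; the function names are illustrative.

```python
from collections import Counter
from itertools import product

def wenger_neighbors(point, q, m):
    """Lines adjacent to `point` in W_m(q), for prime q (arithmetic mod q)."""
    neighbors = []
    for l1 in range(q):
        line = [l1]
        for i in range(1, m + 1):
            # Solve l_{i+1} + p_{i+1} = l_i * p_1 for l_{i+1}; point[i] holds p_{i+1}.
            line.append((line[-1] * point[0] - point[i]) % q)
        neighbors.append(tuple(line))
    return neighbors

def wenger_graph(q, m):
    """Map each point of F_q^{m+1} to the list of its adjacent lines."""
    return {p: wenger_neighbors(p, q, m) for p in product(range(q), repeat=m + 1)}

graph = wenger_graph(3, 2)
line_degrees = Counter(line for lines in graph.values() for line in lines)
```

Each choice of l_1 determines the remaining coordinates of the line, so every point has exactly q neighbors. The same argument with p_1 free shows every line has q neighbors, so the graph is q-regular on 2q^{m+1} vertices.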
