2,671 research outputs found

    Recurrent Models for Situation Recognition

    This work proposes Recurrent Neural Network (RNN) models to predict structured 'image situations' -- actions and noun entities fulfilling semantic roles related to the action. In contrast to prior work relying on Conditional Random Fields (CRFs), we use a specialized action prediction network followed by an RNN for noun prediction. Our system obtains state-of-the-art accuracy on the challenging recent imSitu dataset, beating CRF-based models, including ones trained with additional data. Further, we show that specialized features learned from situation prediction can be transferred to the task of image captioning to more accurately describe human-object interactions. Comment: To appear at ICCV 201

    PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning

    This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting. Inspired by network pruning techniques, we exploit redundancies in large deep networks to free up parameters that can then be employed to learn new tasks. By performing iterative pruning and network re-training, we are able to sequentially "pack" multiple tasks into a single network while ensuring minimal drop in performance and minimal storage overhead. Unlike prior work that uses proxy losses to maintain accuracy on older tasks, we always optimize for the task at hand. We perform extensive experiments on a variety of network architectures and large-scale datasets, and observe much better robustness against catastrophic forgetting than prior work. In particular, we are able to add three fine-grained classification tasks to a single ImageNet-trained VGG-16 network and achieve accuracies close to those of separately trained networks for each task. Code available at https://github.com/arunmallya/packne
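
    The pack-by-pruning idea described above can be illustrated with a small sketch. It is not the authors' implementation: the magnitude-based pruning criterion, the flat weight list, and the names `pack_task`, `owner`, and `prune_fraction` are all assumptions made for illustration, and the training step between pruning rounds is omitted.

```python
def pack_task(weights, owner, task_id, prune_fraction):
    """Assign the strongest free weights to `task_id` and release the rest.

    weights: list of floats (a flattened layer, hypothetical layout)
    owner:   list of ints, 0 = free, k > 0 = frozen for task k
    Assumption: weights are ranked by absolute value (magnitude pruning).
    """
    weights = list(weights)
    owner = list(owner)
    # Only weights not already claimed by an earlier task may be touched.
    free = [i for i, o in enumerate(owner) if o == 0]
    free.sort(key=lambda i: abs(weights[i]))
    n_prune = int(prune_fraction * len(free))
    # Smallest-magnitude free weights are zeroed and stay free for later tasks.
    for i in free[:n_prune]:
        weights[i] = 0.0
    # The surviving free weights are frozen as belonging to this task.
    for i in free[n_prune:]:
        owner[i] = task_id
    return weights, owner


w = [0.1, -0.9, 0.5, -0.2, 0.8, 0.05, 0.7, -0.3]
w1, owner1 = pack_task(w, [0] * len(w), task_id=1, prune_fraction=0.5)
```

    In this sketch, a second call with `task_id=2` would only ever modify the entries still marked 0 in `owner1`, which is what keeps earlier tasks intact.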

    Adaptive Object Detection Using Adjacency and Zoom Prediction

    State-of-the-art object detection systems rely on an accurate set of region proposals. Several recent methods use a neural network architecture to hypothesize promising object locations. While these approaches are computationally efficient, they rely on fixed image regions as anchors for predictions. In this paper we propose to use a search strategy that adaptively directs computational resources to sub-regions likely to contain objects. Compared to methods based on fixed anchor locations, our approach naturally adapts to cases where object instances are sparse and small. Our approach is comparable in terms of accuracy to the state-of-the-art Faster R-CNN approach while using two orders of magnitude fewer anchors on average. Code is publicly available. Comment: Accepted to CVPR 201

    Lower Bounds for the Cop Number When the Robber is Fast

    We consider a variant of the Cops and Robbers game where the robber can move t edges at a time, and show that in this variant, the cop number of a d-regular graph with girth larger than 2t+2 is Omega(d^t). By the known upper bounds on the order of cages, this implies that the cop number of a connected n-vertex graph can be as large as Omega(n^{2/3}) if t>1, and Omega(n^{4/5}) if t>3. This improves the Omega(n^{(t-3)/(t-2)}) lower bound of Frieze, Krivelevich, and Loh (Variations on Cops and Robbers, J. Graph Theory, 2011) when 1<t<7. We also conjecture a general upper bound O(n^{t/(t+1)}) for the cop number in this variant, generalizing Meyniel's conjecture. Comment: 5 page
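
    The claimed range of improvement can be checked by comparing exponents of n directly. The sketch below uses only the exponents quoted in the abstract; the function names are hypothetical, and t=2 is left out of the loop because the earlier exponent (t-3)/(t-2) is undefined there.

```python
from fractions import Fraction


def earlier_exponent(t):
    """Exponent (t-3)/(t-2) of the earlier lower bound, defined for t >= 3."""
    return Fraction(t - 3, t - 2)


def new_exponent(t):
    """Exponent from the abstract: 2/3 if t > 1, and 4/5 if t > 3."""
    return Fraction(4, 5) if t > 3 else Fraction(2, 3)


# Values of t in 3..9 where the new exponent is strictly larger.
improved = [t for t in range(3, 10) if new_exponent(t) > earlier_exponent(t)]
```

    Here `improved` comes out as [3, 4, 5, 6]; at t=7 both exponents equal 4/5, which matches the stated range 1<t<7.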

    On the Spectrum of Wenger Graphs

    Let $q=p^e$, where $p$ is a prime and $e\geq 1$ is an integer. For $m\geq 1$, let $P$ and $L$ be two copies of the $(m+1)$-dimensional vector spaces over the finite field $\mathbb{F}_q$. Consider the bipartite graph $W_m(q)$ with partite sets $P$ and $L$ defined as follows: a point $(p)=(p_1,p_2,\ldots,p_{m+1})\in P$ is adjacent to a line $[l]=[l_1,l_2,\ldots,l_{m+1}]\in L$ if and only if the following $m$ equalities hold: $l_{i+1} + p_{i+1}=l_{i}p_1$ for $i=1,\ldots, m$. We call the graphs $W_m(q)$ Wenger graphs. In this paper, we determine all distinct eigenvalues of the adjacency matrix of $W_m(q)$ and their multiplicities. We also survey results on Wenger graphs. Comment: 9 pages; accepted for publication to J. Combin. Theory, Series
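
    The adjacency rule above can be made concrete with a short sketch. It assumes $q$ is prime, so that arithmetic modulo $q$ is the field arithmetic (prime powers with $e>1$ would need a proper finite-field implementation); the function names are hypothetical. Tuples are 0-indexed, so `p[0]` stands for $p_1$.

```python
from itertools import product


def wenger_adjacent(p, l, q):
    """Check l_{i+1} + p_{i+1} = l_i * p_1 (mod q) for i = 1..m.

    Assumes q is prime; p and l are tuples of length m+1 over range(q).
    """
    m = len(p) - 1
    return all((l[j + 1] + p[j + 1]) % q == (l[j] * p[0]) % q for j in range(m))


def wenger_neighbors(p, q):
    """All lines adjacent to point p, found by brute force."""
    m = len(p) - 1
    return [l for l in product(range(q), repeat=m + 1) if wenger_adjacent(p, l, q)]


q, m = 3, 2
degrees = {len(wenger_neighbors(p, q)) for p in product(range(q), repeat=m + 1)}
```

    Once $l_1$ is chosen, the equalities fix $l_2,\ldots,l_{m+1}$ in turn, so each point has exactly $q$ neighbours; the brute-force `degrees` set reflects this for the small case shown.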