The Effect of Soils on Settlement Location in Colonial Tidewater, Virginia
Anthropology, Master of Arts (M.A.)
A Low-Shot Object Counting Network With Iterative Prototype Adaptation
We consider low-shot counting of arbitrary semantic categories in an image
using only a few annotated exemplars (few-shot) or no exemplars (no-shot). The
standard few-shot pipeline follows extraction of appearance queries from
exemplars and matching them with image features to infer the object counts.
Existing methods extract queries by feature pooling which neglects the shape
information (e.g., size and aspect) and leads to a reduced object localization
accuracy and count estimates. We propose a Low-shot Object Counting network
with iterative prototype Adaptation (LOCA). Our main contribution is the new
object prototype extraction module, which iteratively fuses the exemplar shape
and appearance information with image features. The module is easily adapted to
zero-shot scenarios, enabling LOCA to cover the entire spectrum of low-shot
counting problems. LOCA outperforms all recent state-of-the-art methods on the
FSC147 benchmark by 20-30% in RMSE in the one-shot and few-shot settings and
achieves state-of-the-art results in zero-shot scenarios, while demonstrating
better generalization capabilities.
Comment: Accepted to ICCV 2023, code: https://github.com/djukicn/loc
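The standard few-shot pipeline that the abstract contrasts against (pooled appearance queries matched against image features) can be sketched as follows. This is an illustrative toy, assuming NumPy arrays; names and shapes are not taken from the LOCA code base.

```python
import numpy as np

def extract_query(feat, box):
    """Pool an appearance query from an exemplar box (x1, y1, x2, y2).

    Plain average pooling collapses the exemplar to a single C-dim
    vector, discarding its size and aspect ratio -- the shape
    information the paper argues this standard approach loses.
    """
    x1, y1, x2, y2 = box
    return feat[:, y1:y2, x1:x2].mean(axis=(1, 2))        # (C,)

def response_maps(feat, queries):
    """Correlate each pooled query with the image feature map."""
    c, h, w = feat.shape
    q = np.stack(queries)                                 # (K, C)
    return (q @ feat.reshape(c, -1)).reshape(len(queries), h, w)

feat = np.random.rand(8, 16, 16)                          # toy (C, H, W) features
queries = [extract_query(feat, (2, 2, 6, 6)),
           extract_query(feat, (9, 9, 13, 13))]
resp = response_maps(feat, queries)                       # (2, 16, 16) similarity maps
```

Object counts are then inferred from such response (density) maps; LOCA's contribution is to replace the pooled queries with prototypes that iteratively fuse shape and appearance.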
Trans2k: Unlocking the Power of Deep Models for Transparent Object Tracking
Visual object tracking has focused predominantly on opaque objects, while
transparent object tracking received very little attention. Motivated by the
uniqueness of transparent objects in that their appearance is directly affected
by the background, the first dedicated evaluation dataset has emerged recently.
We contribute to this effort by proposing the first transparent object tracking
training dataset Trans2k that consists of over 2k sequences with 104,343 images
overall, annotated by bounding boxes and segmentation masks. Noting that
transparent objects can be realistically rendered by modern renderers, we
quantify domain-specific attributes and render the dataset containing visual
attributes and tracking situations not covered in the existing object training
datasets. We observe a consistent performance boost (up to 16%) across a
diverse set of modern tracking architectures when trained using Trans2k, and
show insights not previously possible due to the lack of appropriate training
sets. The dataset and the rendering engine will be publicly released to unlock
the power of modern learning-based trackers and foster new designs in
transparent object tracking.
Comment: Accepted to BMVC 2022. Project page:
https://github.com/trojerz/Trans2
Beyond standard benchmarks: Parameterizing performance evaluation in visual object tracking
Object-to-camera motion produces a variety of apparent motion patterns that significantly affect the performance of short-term visual trackers. Despite being crucial for designing robust trackers, their influence is poorly explored in standard benchmarks due to weakly defined, biased and overlapping attribute annotations. In this paper we propose to go beyond pre-recorded benchmarks with post-hoc annotations by presenting an approach that utilizes omnidirectional videos to generate realistic, consistently annotated, short-term tracking scenarios with exactly parameterized motion patterns. We have created an evaluation system, constructed a fully annotated dataset of omnidirectional videos, and built generators for typical motion patterns. We provide an in-depth analysis of major tracking paradigms which is complementary to the standard benchmarks and confirms the expressiveness of our evaluation approach.
An Architecture for Context-Aware Food and Beverage Preparation Systems
This paper introduces a universal architecture for a CONtext-aware Food and bEverage preparation System (CONFES), addressing the optimization issue in food and beverage preparation with the aim of achieving nutritious, sustainable, and tasteful results. The concept is based on a comprehensive review of the state of the art in Machine Learning (ML) approaches for food preparation, and the latest technical developments in Cyber-Physical Systems (CPS). The system requirements, overarching architecture, essential components, and data model for CONFES are defined, leading to a more concrete case study. The latter describes a context-aware coffee machine as a practical implementation of the proposed architecture. The study demonstrates how CONFES can be customized to meet the specific requirements of a coffee machine, showcasing the adaptability and versatility of the overall architectural framework. The research findings contribute to the development of intelligent and context-aware systems in the domain of food and beverage preparation.
Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters
Standard RGB-D trackers treat the target as an inherently 2D structure, which
makes modelling appearance changes related even to simple out-of-plane rotation
highly challenging. We address this limitation by proposing a novel long-term
RGB-D tracker - Object Tracking by Reconstruction (OTR). The tracker performs
online 3D target reconstruction to facilitate robust learning of a set of
view-specific discriminative correlation filters (DCFs). The 3D reconstruction
supports two performance-enhancing features: (i) generation of accurate spatial
support for constrained DCF learning from its 2D projection and (ii) point
cloud based estimation of 3D pose change for selection and storage of
view-specific DCFs which are used to robustly localize the target after
out-of-view rotation or heavy occlusion. Extensive evaluation of OTR on the
challenging Princeton RGB-D tracking and STC Benchmarks shows it outperforms
the state-of-the-art by a large margin.
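The view-selection step described above can be illustrated with a minimal sketch. All names here are hypothetical; in OTR itself the pose change is estimated from the reconstructed point cloud rather than passed in directly.

```python
import numpy as np

def select_view_filter(filters, view_poses, current_pose):
    """Return the stored view-specific filter whose associated pose
    is closest to the current pose estimate (e.g. an out-of-plane
    rotation angle obtained from point-cloud registration)."""
    poses = np.asarray(view_poses, dtype=float)
    idx = int(np.argmin(np.abs(poses - current_pose)))
    return filters[idx], idx

filters = ["dcf_front", "dcf_left", "dcf_back"]   # placeholder filter handles
poses = [0.0, 90.0, 180.0]                        # degrees of rotation per stored view
chosen, idx = select_view_filter(filters, poses, current_pose=75.0)
```

Keeping one discriminative correlation filter per observed view lets the tracker re-localize the target after it rotates back into a previously seen pose.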
Dense Feature Aggregation and Pruning for RGBT Tracking
How to perform effective information fusion of different modalities is a core
factor in boosting the performance of RGBT tracking. This paper presents a
novel deep fusion algorithm based on the representations from an end-to-end
trained convolutional neural network. To deploy the complementarity of features
of all layers, we propose a recursive strategy to densely aggregate these
features that yield robust representations of target objects in each modality.
In different modalities, we propose to prune the densely aggregated features of
all modalities in a collaborative way. Specifically, we employ the operations
of global average pooling and weighted random selection to perform channel
scoring and selection, which could remove redundant and noisy features to
achieve more robust feature representation. Experimental results on two RGBT
tracking benchmark datasets suggest that our tracker achieves clear
state-of-the-art performance against other RGB and RGBT tracking methods.
Comment: arXiv admin note: text overlap with arXiv:1811.0985
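The channel scoring and selection step the abstract describes (global average pooling for scores, weighted random selection for pruning) can be sketched as below; the exact scoring and normalization used in the paper may differ.

```python
import numpy as np

def prune_channels(feat, keep_ratio=0.5, seed=0):
    """Score channels by global average pooling of activation
    magnitudes, then keep a subset via weighted random selection
    (higher-scoring channels are more likely to survive)."""
    rng = np.random.default_rng(seed)
    c = feat.shape[0]
    scores = np.abs(feat).mean(axis=(1, 2))      # (C,) GAP-based channel scores
    probs = scores / scores.sum()                # turn scores into sampling weights
    kept = np.sort(rng.choice(c, size=int(c * keep_ratio),
                              replace=False, p=probs))
    return feat[kept], kept

feat = np.random.rand(16, 8, 8)                  # toy per-modality feature map
pruned, kept = prune_channels(feat, keep_ratio=0.5)
```

The randomness keeps pruning stochastic across frames rather than always dropping the same low-scoring channels, which is the intuition behind weighted random selection over a hard top-k cut.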
The Thermal Infrared Visual Object Tracking VOT-TIR2015 challenge results
The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.
The visual object tracking VOT2017 challenge results
The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies and a new 'real-time' experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The VOT2017 goes beyond its predecessors by (i) improving the VOT public dataset and introducing a separate VOT2017 sequestered dataset, (ii) introducing a real-time tracking experiment and (iii) releasing a redesigned toolkit that supports complex experiments. The dataset, the evaluation kit and the results are publicly available at the challenge website.
