Zero-Annotation Object Detection with Web Knowledge Transfer
Object detection is one of the major problems in computer vision, and has
been extensively studied. Most of the existing detection works rely on
labor-intensive supervision, such as ground truth bounding boxes of objects or
at least image-level annotations. In contrast, we propose an object
detection method that does not require any form of human annotation on target
tasks, by exploiting freely available web images. In order to facilitate
effective knowledge transfer from web images, we introduce a multi-instance
multi-label domain adaptation learning framework with two key innovations.
First, we propose an instance-level adversarial domain adaptation network with
attention on foreground objects to transfer the object appearances from web
domain to target domain. Second, to preserve the class-specific semantic
structure of transferred object features, we propose a simultaneous transfer
mechanism to transfer the supervision across domains through pseudo strong
label generation. With our end-to-end framework that simultaneously learns a
weakly supervised detector and transfers knowledge across domains, we achieved
significant improvements over baseline methods on the benchmark datasets.
Comment: Accepted in ECCV 201
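The adversarial domain adaptation described above typically relies on a gradient-reversal trick: a domain classifier is trained to tell web features from target features, while reversed gradients push the feature extractor to make the two domains indistinguishable. A minimal numpy sketch of a gradient-reversal layer (the class name and `lam` parameter are illustrative, not the paper's code):

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; negates and scales gradients in the
    backward pass, so the feature extractor is updated to *fool* the domain
    classifier while the classifier itself trains normally."""

    def __init__(self, lam=1.0):
        self.lam = lam  # trade-off between task loss and domain confusion

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        # Reverse the domain-classifier gradient before it reaches the
        # feature extractor.
        return -self.lam * grad_output

# Toy check: activations are intact, gradient sign is flipped.
grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
g = np.array([0.1, 0.2, -0.3])
out = grl.forward(x)
grad = grl.backward(g)
```

In a full pipeline this layer would sit between the instance-level features and the domain classifier, with the paper's foreground attention weighting which instances contribute to the domain loss.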
TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection
This work provides a simple approach to discovering tight object bounding boxes with only image-level supervision, called Tight box mining with Surrounding Segmentation Context (TS2C). We observe that object candidates mined through current multiple instance learning methods are usually trapped in discriminative object parts rather than covering the entire object. TS2C leverages surrounding segmentation context derived from weakly supervised segmentation to suppress such low-quality distracting candidates and boost the high-quality ones. Specifically, TS2C is developed based on two key properties of desirable bounding boxes: (1) high purity, meaning most pixels in the box have high object response, and (2) high completeness, meaning the box covers the high-object-response pixels comprehensively. With these novel and computable criteria, tighter candidates can be discovered for learning a better object detector. With TS2C, we obtain 48.0% and 44.4% mAP on the VOC 2007 and 2012 benchmarks, setting a new state of the art on both.
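The purity and completeness criteria above are directly computable from a segmentation response map. A minimal numpy sketch under the assumption that responses are thresholded into foreground pixels (function names, the box convention `(x0, y0, x1, y1)`, and the threshold are illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def purity(response, box, thresh=0.5):
    """Fraction of pixels inside the box whose object response is high."""
    x0, y0, x1, y1 = box
    inside = response[y0:y1, x0:x1]
    return (inside >= thresh).mean()

def completeness(response, box, thresh=0.5):
    """Fraction of all high-response pixels that the box covers."""
    x0, y0, x1, y1 = box
    mask = response >= thresh
    total = mask.sum()
    return mask[y0:y1, x0:x1].sum() / total if total > 0 else 0.0

# Toy response map: a 4x4 high-response object inside an 8x8 image.
resp = np.zeros((8, 8))
resp[2:6, 2:6] = 1.0

tight = (2, 2, 6, 6)    # covers exactly the object: pure AND complete
partial = (2, 2, 4, 4)  # a discriminative "part": pure but incomplete
```

Here the partial box scores perfect purity but low completeness, which is exactly the signal TS2C uses to demote part-only candidates in favor of boxes covering the whole object.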
Forget and diversify: Regularized refinement for weakly supervised object detection
We study weakly supervised learning for object detectors, where training images carry image-level class labels only. This problem is often addressed by multiple instance learning, where pseudo-labels of proposals are constructed from the image-level weak labels and detectors are learned from these potentially noisy labels. Since existing methods train models in a discriminative manner, they typically suffer from collapsing onto salient parts and fail to localize multiple instances within an image. To alleviate these limitations, we propose simple yet effective regularization techniques, weight reinitialization and labeling perturbations, which prevent overfitting to noisy labels by forgetting biased weights. We also introduce a graph-based mode-seeking technique that identifies multiple object instances in a principled way. The combination of the two proposed techniques reduces the overfitting observed frequently in the weakly supervised setting and greatly improves object localization performance on standard benchmarks.
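The weight reinitialization idea above can be sketched very simply: at fixed intervals during training, the detector head's weights are resampled so that biases accumulated from noisy pseudo-labels are forgotten, while shared backbone weights are kept. A minimal numpy sketch, where the layer names, the reinitialization period, and the init scale are all illustrative assumptions:

```python
import numpy as np

def reinitialize_head(weights, rng, scale=0.01):
    """'Forget' potentially biased detector-head weights by resampling them
    from a small Gaussian; shapes are preserved so training can resume."""
    return {name: rng.normal(0.0, scale, w.shape) for name, w in weights.items()}

rng = np.random.default_rng(0)
head = {"fc_cls": np.ones((16, 4)), "fc_box": np.ones((16, 8))}

# Periodically during training, reset the head so it cannot lock onto the
# noisy pseudo-labels (salient parts) mined in earlier rounds.
for step in range(1, 301):
    if step % 100 == 0:  # reinitialization period (illustrative)
        head = reinitialize_head(head, rng)
```

The labeling-perturbation regularizer and the graph-based mode-seeking step are separate components not shown here; this only illustrates the forgetting mechanism.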
