Learning to Estimate 3D Hand Pose from Single RGB Images
Low-cost consumer depth cameras and deep learning have enabled reasonable 3D
hand pose estimation from single depth images. In this paper, we present an
approach that estimates 3D hand pose from regular RGB images. This task has far
more ambiguities due to the missing depth information. To this end, we propose
a deep network that learns a network-implicit 3D articulation prior. Together
with detected keypoints in the images, this network yields good estimates of
the 3D pose. We introduce a large-scale 3D hand pose dataset based on synthetic
hand models for training the involved networks. Experiments on a variety of
test sets, including one on sign language recognition, demonstrate the
feasibility of 3D hand pose estimation on single color images.
Comment: Accepted to ICCV 2017. Code and dataset are released at
https://lmb.informatik.uni-freiburg.de/projects/hand3d
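As a rough illustration of the two-stage design described in the abstract (2D keypoint detection followed by a learned lifting network that stands in for the articulation prior), the sketch below assumes PyTorch and uses hypothetical module names (KeypointNet, PosePriorNet) and a 21-keypoint hand model; it is not the authors' released code.

    # Minimal sketch, assuming PyTorch; module names and sizes are illustrative.
    import torch
    import torch.nn as nn

    NUM_KEYPOINTS = 21  # common hand-keypoint count; an assumption here

    class KeypointNet(nn.Module):
        """Predicts 2D score maps for hand keypoints from an RGB crop."""
        def __init__(self):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            )
            self.head = nn.Conv2d(64, NUM_KEYPOINTS, 1)

        def forward(self, rgb):                    # rgb: (B, 3, H, W)
            return self.head(self.backbone(rgb))   # (B, K, H, W) score maps

    class PosePriorNet(nn.Module):
        """Lifts 2D score maps to a 3D pose; the learned weights play the role
        of the network-implicit 3D articulation prior described above."""
        def __init__(self, map_size=32):
            super().__init__()
            self.lift = nn.Sequential(
                nn.Flatten(),
                nn.Linear(NUM_KEYPOINTS * map_size * map_size, 512), nn.ReLU(),
                nn.Linear(512, NUM_KEYPOINTS * 3),
            )

        def forward(self, score_maps):             # (B, K, 32, 32)
            return self.lift(score_maps).view(-1, NUM_KEYPOINTS, 3)

    # Usage: detect 2D keypoints, then lift them to 3D.
    rgb = torch.randn(1, 3, 32, 32)
    pose3d = PosePriorNet()(KeypointNet()(rgb))
    print(pose3d.shape)  # torch.Size([1, 21, 3])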
Geodesic Distance Histogram Feature for Video Segmentation
This paper proposes a geodesic-distance-based feature that encodes global
information for improved video segmentation algorithms. The feature is a joint
histogram of intensity and geodesic distances, where the geodesic distances are
computed as the shortest paths between superpixels via their boundaries. We
also incorporate adaptive voting weights and spatial pyramid configurations to
include spatial information into the geodesic histogram feature and show that
this further improves results. The feature is generic and can be used as part
of various algorithms. In experiments, we test the geodesic histogram feature
by incorporating it into two existing video segmentation frameworks. This leads
to significantly better performance on 3D video segmentation benchmarks across
two datasets.
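As a rough sketch of how such a feature could be assembled, the snippet below builds a joint (intensity, geodesic distance) histogram over a superpixel adjacency graph using SciPy's shortest-path routine; the function name, binning, and edge costs are illustrative assumptions, not the paper's exact formulation.

    # Minimal sketch, assuming NumPy/SciPy; superpixel graph is given as input.
    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import dijkstra

    def geodesic_histogram(intensities, edges, weights, source, bins=(8, 8)):
        """intensities: (N,) mean intensity per superpixel
        edges: (E, 2) index pairs of adjacent superpixels
        weights: (E,) boundary costs between adjacent superpixels
        source: index of the reference superpixel
        Returns a normalized joint histogram of intensity vs. geodesic distance."""
        n = len(intensities)
        graph = csr_matrix((weights, (edges[:, 0], edges[:, 1])), shape=(n, n))
        # Geodesic distance = shortest path through the superpixel adjacency graph.
        dist = dijkstra(graph, directed=False, indices=source)
        hist, _, _ = np.histogram2d(intensities, dist, bins=bins)
        return hist / hist.sum()

    # Toy usage with a four-superpixel chain graph.
    intens = np.array([0.1, 0.4, 0.6, 0.9])
    edges = np.array([[0, 1], [1, 2], [2, 3]])
    costs = np.array([0.3, 0.2, 0.5])
    print(geodesic_histogram(intens, edges, costs, source=0))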
U-Net: Convolutional Networks for Biomedical Image Segmentation
There is broad consensus that successful training of deep networks requires
many thousands of annotated training samples. In this paper, we present a network
and training strategy that relies on the strong use of data augmentation to use
the available annotated samples more efficiently. The architecture consists of
a contracting path to capture context and a symmetric expanding path that
enables precise localization. We show that such a network can be trained
end-to-end from very few images and outperforms the prior best method (a
sliding-window convolutional network) on the ISBI challenge for segmentation of
neuronal structures in electron microscopic stacks. Using the same network
trained on transmitted light microscopy images (phase contrast and DIC), we won
the ISBI cell tracking challenge 2015 in these categories by a large margin.
Moreover, the network is fast. Segmentation of a 512x512 image takes less than
a second on a recent GPU. The full implementation (based on Caffe) and the
trained networks are available at
http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
Comment: conditionally accepted at MICCAI 2015
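As a rough sketch of the contracting/expanding idea with skip connections, the toy network below (assuming PyTorch; channel counts, depth, and padding choices are illustrative and far smaller than the published architecture) shows how high-resolution encoder features are concatenated back in during decoding for precise localization.

    # Minimal sketch, assuming PyTorch; not the released Caffe implementation.
    import torch
    import torch.nn as nn

    def conv_block(in_ch, out_ch):
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
        )

    class TinyUNet(nn.Module):
        def __init__(self, in_ch=1, num_classes=2):
            super().__init__()
            self.enc1 = conv_block(in_ch, 16)               # contracting path
            self.enc2 = conv_block(16, 32)
            self.pool = nn.MaxPool2d(2)
            self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
            self.dec1 = conv_block(32, 16)                  # expanding path
            self.head = nn.Conv2d(16, num_classes, 1)

        def forward(self, x):
            e1 = self.enc1(x)                               # high-resolution features
            e2 = self.enc2(self.pool(e1))                   # context at lower resolution
            d1 = self.up(e2)                                # upsample back
            d1 = self.dec1(torch.cat([d1, e1], dim=1))      # skip connection for localization
            return self.head(d1)

    # A 512x512 single-channel image yields per-pixel class scores.
    print(TinyUNet()(torch.randn(1, 1, 512, 512)).shape)  # (1, 2, 512, 512)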
