Attention Gated Networks: Learning to Leverage Salient Regions in Medical Images
We propose a novel attention gate (AG) model for medical image analysis that
automatically learns to focus on target structures of varying shapes and sizes.
Models trained with AGs implicitly learn to suppress irrelevant regions in an
input image while highlighting salient features useful for a specific task.
This eliminates the need for explicit external tissue/organ localisation
modules when using convolutional neural networks (CNNs). AGs can be easily
integrated into standard CNN models such as VGG or
U-Net architectures with minimal computational overhead while increasing the
model sensitivity and prediction accuracy. The proposed AG models are evaluated
on a variety of tasks, including medical image classification and segmentation.
For classification, we demonstrate the use case of AGs in scan plane detection
for fetal ultrasound screening. We show that the proposed attention mechanism
can provide efficient object localisation while improving the overall
prediction performance by reducing false positives. For segmentation, the
proposed architecture is evaluated on two large 3D CT abdominal datasets with
manual annotations for multiple organs. Experimental results show that AG
models consistently improve the prediction performance of the base
architectures across different datasets and training sizes while preserving
computational efficiency. Moreover, AGs guide the model activations to be
focused around salient regions, which provides better insights into how model
predictions are made. The source code for the proposed AG models is publicly
available.

Comment: Accepted for Medical Image Analysis (Special Issue on Medical Imaging
with Deep Learning). arXiv admin note: substantial text overlap with
arXiv:1804.03999, arXiv:1804.0533
Predicting Slice-to-Volume Transformation in Presence of Arbitrary Subject Motion
This paper aims to solve a fundamental problem in intensity-based 2D/3D
registration, which concerns the limited capture range and need for very good
initialization of state-of-the-art image registration methods. We propose a
regression approach that learns to predict the rotations and translations of
arbitrary 2D image slices from 3D volumes, with respect to a learned canonical
atlas co-ordinate system. To this end, we utilize Convolutional Neural Networks
(CNNs) to learn the highly complex regression function that maps 2D image
slices into their correct position and orientation in 3D space. Our approach is
attractive in challenging imaging scenarios, where significant subject motion
complicates reconstruction performance of 3D volumes from 2D slice data. We
extensively evaluate the effectiveness of our approach quantitatively on
simulated MRI brain data with extreme random motion. We further demonstrate
qualitative results on fetal MRI where our method is integrated into a full
reconstruction and motion compensation pipeline. With our CNN regression
approach we obtain an average prediction error of 7mm on simulated data, and
convincing reconstruction quality of images of very young fetuses where
previous methods fail. We further discuss applications to Computed Tomography
and X-ray projections. Our approach is a general solution to the 2D/3D
initialization problem. It is computationally efficient, with prediction times
per slice of a few milliseconds, making it suitable for real-time scenarios.

Comment: 8 pages, 4 figures, 6 pages supplemental material, currently under
review for MICCAI 201
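One way to picture the regression target: the network outputs six numbers (three rotation angles, three translations) that define a rigid transform placing a 2D slice in the canonical atlas coordinate system, and this transform initializes the subsequent intensity-based refinement. A minimal sketch of assembling such a transform; the Euler-angle convention here is an assumption, and the paper's actual pose parametrization may differ.

```python
import numpy as np

def rigid_transform(rx, ry, rz, tx, ty, tz):
    """Build a 4x4 rigid transform from Euler angles (radians) and a translation.

    Illustrative ZYX convention; the parametrization used in the paper
    may differ (e.g. anchor-point or quaternion representations).
    """
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # rotation part
    T[:3, 3] = [tx, ty, tz]    # translation part
    return T

# Six CNN-predicted parameters -> initial slice pose for refinement
T = rigid_transform(0.1, -0.2, 0.3, 5.0, -2.0, 1.0)
print(np.round(T, 3))
```

Mapping slice content directly to these six numbers is what gives the method its large capture range: no iterative optimization is needed to get a usable initial pose.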
Frequency Dropout: Feature-Level Regularization via Randomized Filtering
Deep convolutional neural networks have shown remarkable performance on
various computer vision tasks, and yet, they are susceptible to picking up
spurious correlations from the training signal. So-called 'shortcuts' can occur
during learning, for example, when specific frequencies present in the image
data correlate with the output predictions. Both high and low frequencies can
be characteristic of the underlying noise distribution introduced by image
acquisition rather than of the task-relevant information about the image
content. Models that learn features related to this characteristic noise will
not generalize well to new data.
In this work, we propose a simple yet effective training strategy, Frequency
Dropout, to prevent convolutional neural networks from learning
frequency-specific imaging features. We employ randomized filtering of feature
maps during training which acts as a feature-level regularization. In this
study, we consider common image processing filters such as Gaussian smoothing,
Laplacian of Gaussian, and Gabor filtering. Our training strategy is
model-agnostic and can be used for any computer vision task. We demonstrate the
effectiveness of Frequency Dropout on a range of popular architectures and
multiple tasks including image classification, domain adaptation, and semantic
segmentation using both computer vision and medical imaging datasets. Our
results suggest that the proposed approach not only improves predictive
accuracy but also improves robustness against domain shift.

Comment: 15 pages
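The randomized-filtering idea can be sketched in a few lines: during training, each feature-map channel is filtered with probability p using a randomly parameterized filter, so the network cannot rely on any fixed frequency band. This toy version uses only Gaussian smoothing (the study also considers Laplacian-of-Gaussian and Gabor filters); the probability, sigma range, and function names are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_smooth(fmap, sigma):
    """Separable Gaussian blur of a 2D feature map ('same' size, edge padding)."""
    radius = int(3 * sigma) + 1
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(fmap, radius, mode='edge')
    # filter rows, then columns
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode='valid'), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='valid'), 0, rows)

def frequency_dropout(feature_maps, p=0.5, sigma_range=(0.5, 2.0), rng=None):
    """With probability p per channel, low-pass filter with a random sigma.

    Randomizing the filter during training acts as a feature-level
    regularizer against frequency-specific shortcuts.
    """
    rng = rng or np.random.default_rng()
    out = feature_maps.copy()
    for c in range(out.shape[0]):
        if rng.random() < p:
            out[c] = gaussian_smooth(out[c], rng.uniform(*sigma_range))
    return out

rng = np.random.default_rng(0)
fmaps = rng.standard_normal((4, 16, 16))          # (channels, H, W)
filtered = frequency_dropout(fmaps, p=1.0, rng=rng)
print(filtered.shape)  # (4, 16, 16)
```

At test time the operation is simply disabled (p = 0), analogous to standard dropout, so inference cost is unchanged.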
