749 research outputs found

    Efficient moving point handling for incremental 3D manifold reconstruction

    As incremental Structure from Motion algorithms become effective, a good sparse point cloud representing the map of the scene becomes available frame by frame. From the 3D Delaunay triangulation of these points, state-of-the-art algorithms build a rough manifold model of the scene. These algorithms incrementally integrate new points into the 3D reconstruction only if their position estimates do not change. Indeed, whenever a point moves in a 3D Delaunay triangulation, for instance because its estimate is refined, a set of tetrahedra has to be removed and replaced with new ones to maintain the Delaunay property; managing the manifold reconstruction thus becomes complex and entails a potentially large overhead. In this paper we investigate different approaches and propose an efficient policy to deal with moving points in the manifold estimation process. We tested our approach on four sequences of the KITTI dataset and show its effectiveness in comparison with state-of-the-art approaches.
    Comment: Accepted in International Conference on Image Analysis and Processing (ICIAP 2015)

    Mesh-based 3D Textured Urban Mapping

    In the era of autonomous driving, urban mapping is a core step in letting vehicles interact with the urban context. Successful mapping algorithms proposed in the last decade build the map from the data of a single sensor. The focus of the system presented in this paper is twofold: the joint estimation of a 3D map from lidar data and images, based on a 3D mesh, and its texturing. Indeed, even though most surveying vehicles for mapping are equipped with cameras and lidar, existing mapping algorithms usually rely on either images or lidar data; moreover, both image-based and lidar-based systems often represent the map as a point cloud, while a continuous textured mesh representation would be useful for visualization and navigation purposes. In the proposed framework, we combine the accuracy of the 3D lidar data with the dense information and appearance carried by the images, estimating a visibility-consistent map from the lidar measurements and refining it photometrically through the acquired images. We evaluate the proposed framework on the KITTI dataset and show the performance improvement with respect to two state-of-the-art urban mapping algorithms and two surface reconstruction algorithms widely used in Computer Graphics.
    Comment: accepted at iros 201

    Multi-View Stereo with Single-View Semantic Mesh Refinement

    While 3D reconstruction is a well-established and widely explored research topic, semantic 3D reconstruction has only recently received increasing attention from the Computer Vision community. Semantic annotations make it possible to enforce strong class-dependent priors, such as planarity for ground and walls, which can be exploited to refine the reconstruction, often resulting in non-trivial performance improvements. State-of-the-art methods propose volumetric approaches to fuse RGB image data with semantic labels; even if successful, they do not scale well and fail to output high-resolution meshes. In this paper we propose a novel method to refine both the geometry and the semantic labeling of a given mesh. We refine the mesh geometry by applying a variational method that optimizes a composite energy made of a state-of-the-art pairwise photometric term and a single-view term that models the semantic consistency between the labels of the 3D mesh and those of the segmented images. We also update the semantic labeling through a novel Markov Random Field (MRF) formulation that, together with the classical data and smoothness terms, takes into account class-specific priors estimated directly from the annotated mesh. This is in contrast to state-of-the-art methods, which are typically based on handcrafted or learned priors. We are the first, jointly with the very recent and seminal work of [M. Blaha et al arXiv:1706.08336, 2017], to propose the use of semantics inside a mesh refinement framework. Unlike [M. Blaha et al arXiv:1706.08336, 2017], which adopts a more classical pairwise comparison to estimate the flow of the mesh, we apply a single-view comparison between the semantically annotated image and the current 3D mesh labels; this improves robustness in the case of noisy segmentations.
    Comment: 3D Reconstruction Meets Semantic, ICCV workshop

    ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation

    We introduce ReConvNet, a recurrent convolutional architecture for semi-supervised video object segmentation that can quickly adapt its features to focus on any specific object of interest at inference time. Generalization to new objects never observed during training is known to be a hard task for supervised approaches, which would need to be retrained. To tackle this problem, we propose a more efficient solution that learns spatio-temporal features that self-adapt to the object of interest via conditional affine transformations. This approach is simple, can be trained end-to-end and does not necessarily require extra training steps at inference time. Our method shows competitive results on DAVIS2016 with respect to state-of-the-art approaches that use online fine-tuning, and outperforms them on DAVIS2017. ReConvNet also shows promising results on the DAVIS-Challenge 2018, achieving the 10th position.
    Comment: CVPR Workshop - DAVIS Challenge 201
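As a rough illustration of what a conditional affine transformation of features can look like, the sketch below scales and shifts each feature channel with its own pair of coefficients. The abstract does not specify the exact form used in ReConvNet, so the per-channel scale and shift, the function name `modulate`, and the parameter names `gamma` and `beta` are all assumptions made for this example; in a real network the coefficients would be predicted from the object of interest rather than passed in by hand.

```python
from typing import List

# A channel is a 2D grid of activations; a feature map is a list of channels.
Channel = List[List[float]]


def modulate(features: List[Channel],
             gamma: List[float],
             beta: List[float]) -> List[Channel]:
    """Apply a per-channel affine transform: out[c] = gamma[c] * x[c] + beta[c].

    Hypothetical helper: gamma and beta stand in for coefficients that a
    conditioning branch would produce from the target object.
    """
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one (gamma, beta) pair per channel")
    return [
        [[g * value + b for value in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


if __name__ == "__main__":
    feats = [[[1.0, 2.0]], [[3.0, 4.0]]]  # two channels, each 1x2
    print(modulate(feats, gamma=[2.0, 0.5], beta=[1.0, -1.0]))
```

Because the transform only rescales and shifts existing activations, the same backbone features can be redirected towards a different object by changing the coefficients alone.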

    On the Development of a Generic Multi-Sensor Fusion Framework for Robust Odometry Estimation

    In this work we review the design choices and the mathematical and software engineering techniques employed in the development of the ROAMFREE sensor fusion library, a general, open-source framework for pose tracking and sensor parameter self-calibration in mobile robotics. In ROAMFREE, a comprehensive logical sensor library makes it possible to abstract from the actual sensor hardware and processing while preserving model accuracy, thanks to a rich set of calibration parameters such as biases, gains, distortion matrices and geometric placement dimensions. The modular formulation of the sensor fusion problem, which is based on state-of-the-art factor graph inference techniques, makes it possible to handle an arbitrary number of multi-rate sensors and to adapt to virtually any kind of mobile robot platform, such as Ackermann steering vehicles, quadrotor unmanned aerial vehicles and omni-directional mobile robots. Different solvers are available to target high-rate online pose tracking tasks and accurate offline trajectory smoothing and parameter calibration. The modularity, versatility and out-of-the-box functioning of the resulting framework came at the cost of a more complex software architecture than an ad-hoc implementation of a platform-dependent sensor fusion algorithm, and required careful design of abstraction layers and decoupling interfaces between solvers, state variable representations and sensor error models. However, we review how a high-level, clean C++/Python API, as well as ROS interface nodes, hides the complexity of sensor fusion tasks from the end user, making ROAMFREE an ideal choice for new and existing mobile robot projects.

    Attention Mechanisms for Object Recognition with Event-Based Cameras

    Event-based cameras are neuromorphic sensors capable of efficiently encoding visual information in the form of sparse sequences of events. Being biologically inspired, they are commonly used to exploit some of the computational and power consumption benefits of biological vision. In this paper we focus on a specific feature of vision: visual attention. We propose two attentive models for event-based vision: an algorithm that tracks event activity within the field of view to locate regions of interest, and a fully differentiable attention procedure based on the DRAW neural model. We highlight the strengths and weaknesses of the proposed methods on four datasets (the Shifted N-MNIST, Shifted MNIST-DVS, CIFAR10-DVS and N-Caltech101 collections), using the Phased LSTM recognition network as a baseline reference model and obtaining improvements in terms of both translation and scale invariance.
    Comment: WACV2019 camera-ready submission

    Modular development of mobile robots with open source hardware and software components

    Prototyping and engineering robot hardware and low-level control often require time and effort that are taken away from core research activities, such as the development of SLAM or planning algorithms, which need a working, reliable platform to be evaluated in a real-world scenario. In this paper, we present Rapid Robot Prototyping (R2P), an open-source hardware and software architecture for the rapid prototyping of robotic applications, where off-the-shelf embedded modules (e.g., sensors, actuators, and controllers) are combined in a plug-and-play fashion, enabling the implementation of a complex system in a simple and modular way. R2P lets people involved in robotics, from researchers and designers to students and hobbyists, dramatically reduce the time and effort required to build a robot prototype.

    Robust Moving Objects Detection in Lidar Data Exploiting Visual Cues

    Detecting moving objects in dynamic scenes from sequences of lidar scans is an important task in object tracking, mapping, localization, and navigation. Many works focus on change detection in previously observed scenes, while very little literature addresses moving object detection. The state-of-the-art method exploits Dempster-Shafer Theory to evaluate the occupancy of a lidar scan and to discriminate points belonging to the static scene from moving ones. In this paper we improve both the speed and the accuracy of this method by discretizing the occupancy representation and by removing false positives through visual cues. Many false positives lying on the ground plane are also removed thanks to a novel ground plane removal algorithm. Efficiency is improved through an octree indexing strategy. Experimental evaluation on the KITTI public dataset shows the effectiveness of our approach, both qualitatively and quantitatively, with respect to the state of the art.
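For readers unfamiliar with Dempster-Shafer Theory, the sketch below shows Dempster's standard rule of combination applied to occupancy evidence for a single cell. It is a generic textbook illustration, not the method of the paper: the two-hypothesis frame (occupied, free) with an explicit "unknown" mass, and the names `Mass` and `combine`, are assumptions made for this example.

```python
from typing import NamedTuple


class Mass(NamedTuple):
    """Belief masses for one cell; the three values should sum to 1."""
    occupied: float
    free: float
    unknown: float  # mass assigned to the whole frame {occupied, free}


def combine(a: Mass, b: Mass) -> Mass:
    """Fuse two independent pieces of evidence with Dempster's rule."""
    # Conflict: one source says occupied while the other says free.
    conflict = a.occupied * b.free + a.free * b.occupied
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    occupied = (a.occupied * b.occupied
                + a.occupied * b.unknown
                + a.unknown * b.occupied) / norm
    free = (a.free * b.free
            + a.free * b.unknown
            + a.unknown * b.free) / norm
    unknown = (a.unknown * b.unknown) / norm
    return Mass(occupied, free, unknown)


if __name__ == "__main__":
    previous_scan = Mass(occupied=0.6, free=0.1, unknown=0.3)
    current_scan = Mass(occupied=0.5, free=0.2, unknown=0.3)
    print(combine(previous_scan, current_scan))
```

A large conflict term between consecutive scans is one possible cue that a cell changed state, which is why this kind of evidence fusion is a natural fit for separating static points from moving ones.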