Stereo and KinectFusion for continuous 3D reconstruction and visual odometry

Robust and accurate 3D reconstruction of a scene is essential for many robotic and computer vision applications. Although recent studies propose accurate reconstruction algorithms, these are suitable only for indoor operation. We propose a system that accurately reconstructs a scene both indoors and outdoors in real time. The system uses both active and passive visual sensors together with peripheral hardware for communication and, by integrating stereo visual odometry, suggests improved reconstruction and pose estimation accuracy over state-of-the-art SLAM algorithms. We also introduce the concept of multisession reconstruction, which is relevant for many real-world applications. In our solution, distinct regions of a scene are reconstructed in detail in separate sessions using the KinectFusion framework and then merged into a global scene using continuous visual odometry camera tracking.
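The multisession idea can be illustrated with a minimal sketch. It is not the authors' implementation; it only assumes that each session yields a point set in its own local frame and that visual odometry supplies a 4x4 rigid transform from that local frame to the global frame. The names `make_pose`, `to_global`, and `merge_sessions` are hypothetical.

```python
import numpy as np

def make_pose(rotation_z_rad, translation):
    """Build a 4x4 rigid transform (hypothetical helper): rotation about z, then translation."""
    c, s = np.cos(rotation_z_rad), np.sin(rotation_z_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def to_global(points_local, T_global_from_session):
    """Map an (N, 3) array of session-local points into the global frame."""
    ones = np.ones((len(points_local), 1))
    homogeneous = np.hstack([points_local, ones])          # (N, 4)
    return (T_global_from_session @ homogeneous.T).T[:, :3]

def merge_sessions(sessions):
    """Concatenate all sessions after moving each into the global frame.

    `sessions` is a list of (points, pose) pairs; the pose is assumed to come
    from continuous visual odometry tracking between sessions.
    """
    return np.vstack([to_global(pts, T) for pts, T in sessions])

# Two toy sessions, each containing the same local point (1, 0, 0).
session_a = (np.array([[1.0, 0.0, 0.0]]), make_pose(0.0, [0.0, 0.0, 0.0]))
session_b = (np.array([[1.0, 0.0, 0.0]]), make_pose(np.pi / 2, [5.0, 0.0, 0.0]))

global_cloud = merge_sessions([session_a, session_b])
print(global_cloud)
```

In this toy case the second session's point is rotated by 90 degrees about the z axis and shifted by 5 units along x, so the two sessions land at different places in the shared frame. A real system would merge volumetric or mesh reconstructions rather than raw points, but the frame change is the same.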
