Laser & Optoelectronics Progress (激光与光电子学进展), Vol. 53, No. 6, article 060003

Review on RGB-D Image Classification (RGB-D图像分类方法研究综述)


Abstract

New 3D sensors make it convenient to capture RGB-D images, which record color and depth information simultaneously across multiple scenes, viewpoints, and targets. Because depth information is invariant to color and brightness even when objects overlap or occlude one another, it can effectively improve the accuracy of RGB-D image classification. This review describes the development and working principle of the Microsoft Kinect in detail, introduces the existing RGB-D datasets, summarizes, analyzes, and compares existing RGB-D feature extraction and classification methods, and discusses development trends in RGB-D image classification.

Additional Information

CLC number: TP391

DOI:10.3788/lop53.060003

Section: Review

Funding: Guangdong Provincial Science and Technology Program (2015A020209148, 2015A020224038, 2015A020209124, 2016A050502050)

Received: 2016-01-04

Revised: 2016-02-25

Published online: 2016-06-01

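Author affiliations

Tu Shuqin (涂淑琴): College of Mathematics and Informatics, South China Agricultural University, Guangzhou, Guangdong 510642, China
Xue Yueju (薛月菊): College of Electronic Engineering, South China Agricultural University, Guangzhou, Guangdong 510642, China
Liang Yun (梁云): College of Mathematics and Informatics, South China Agricultural University, Guangzhou, Guangdong 510642, China
Huang Ning (黄宁): College of Electronic Engineering, South China Agricultural University, Guangzhou, Guangdong 510642, China
Zhang Xiao (张晓): College of Electronic Engineering, South China Agricultural University, Guangzhou, Guangdong 510642, China

Corresponding author: Tu Shuqin (tushuqin@163.com)

Note: Tu Shuqin (born 1978), female, Ph.D. candidate; her research focuses on image scene classification and object recognition.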

