An Improved Algorithm for Detection and Pose Estimation of Texture-Less Objects

JACIII Vol.25 No.2 pp. 204-212 (2021)

doi: 10.20965/jaciii.2021.p0204

Paper:

Jian Peng^*,** and Ya Su^*,**

^*School of Automation, China University of Geosciences
388 Lumo Road, Hongshan, Wuhan, Hubei 430074, China

^**Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems
388 Lumo Road, Hongshan, Wuhan, Hubei 430074, China

Received: September 29, 2020

Accepted: December 22, 2020

Published: March 20, 2021

Keywords: computer vision, object detection and pose estimation, LineMOD algorithm

Abstract

This paper introduces an improved algorithm for detecting texture-less objects and estimating their pose in industrial scenes. In the template training stage, a multi-scale template training method is proposed to reduce the sensitivity of LineMOD to template depth. During template matching, the test image is first divided into several regions, and for each region only the training templates whose depth is close to the depth of that region are selected. Without traversing all the templates, the depth of the templates used for matching therefore stays close to the depth of the target object, which speeds up the algorithm without reducing recognition accuracy. This paper also proposes a method called coarse positioning of objects, which avoids many useless matching operations and further improves the speed of the algorithm. The experimental results show that the improved LineMOD algorithm effectively solves the template depth sensitivity problem of the original algorithm.

We propose a multi-scale template training method and a method called coarse positioning of objects.
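The depth-based template selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the regular grid split, the use of the median as the depth of a region, the `tolerance` parameter, and the dictionary layout of a template are all assumptions made for illustration.

```python
from statistics import median


def region_depths(depth, rows, cols):
    """Split a depth map into a rows x cols grid (assumed layout) and
    return the median valid depth of each cell, or None if the cell
    contains no valid reading. A depth of 0 is treated as missing."""
    h, w = len(depth), len(depth[0])
    result = {}
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            values = [depth[y][x]
                      for y in range(y0, y1)
                      for x in range(x0, x1)
                      if depth[y][x] > 0]
            result[(r, c)] = median(values) if values else None
    return result


def select_templates(templates, region_depth, tolerance):
    """Keep only the templates trained at a depth within `tolerance`
    of the region depth, so matching never traverses all templates."""
    if region_depth is None:
        return []
    return [t for t in templates
            if abs(t["depth"] - region_depth) <= tolerance]


# Hypothetical templates trained at four depths (millimetres).
templates = [{"id": i, "depth": d}
             for i, d in enumerate([400, 600, 800, 1000])]

# Toy 2 x 4 depth map: a near object on the left, a far one on the right.
depth_map = [[410, 420, 990, 1000],
             [400, 0, 1010, 980]]

regions = region_depths(depth_map, rows=1, cols=2)
candidates = {cell: [t["id"] for t in select_templates(templates, d, 100)]
              for cell, d in regions.items()}
print(candidates)
```

In this toy case each region is matched against a single template instead of all four, which is the source of the speed-up the abstract describes. The tolerance controls the trade-off: a wider band matches more templates per region, a narrower one risks excluding the correct scale.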

Cite this article as:

J. Peng and Y. Su, “An Improved Algorithm for Detection and Pose Estimation of Texture-Less Objects,” J. Adv. Comput. Intell. Intell. Inform., Vol.25 No.2, pp. 204-212, 2021.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
