Influence of Object Detection in Deep Learning

Rui Yu; Xiangyang Xu; Zhigang Wang

doi:10.20965/jaciii.2018.p0683

JACIII Vol.22 No.5 pp. 683-688

(2018)

Paper:

School of Automation, Beijing Institute of Technology
No.5 Zhongguancun South Street, Haidian District, Beijing 100081, China

Received:

February 27, 2018

Accepted:

June 11, 2018

Published:

September 20, 2018

Keywords:

object detection, network structure, training dataset, accuracy

Abstract

We investigate the factors that influence object detection accuracy in deep learning. Using a single neural network model and maintaining its primary network structure, we examine how detection accuracy depends on the scale of the training dataset and on the network depth and width. We run a single-factor experiment for each factor and create a test dataset containing pictures of different types of objects. After each experiment, we first compute the average precision on the validation dataset and then test on the target pictures. The results show that enriching the training dataset is an effective way to improve accuracy: the more of the necessary features the training dataset contains, the more precise the results. The network structure is also a crucial factor, and adopting advanced models could help achieve excellent performance on sophisticated targets.
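The single-factor design and the average precision metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The factor names (`train_images`, `depth`, `width`), their levels, and both helper functions are hypothetical. The abstract does not say which variant of average precision was used, so the sketch assumes the plain non-interpolated form (mean precision at each true-positive rank, divided by the number of ground-truth objects).

```python
from typing import Dict, List, Sequence, Tuple


def single_factor_configs(
    baseline: Dict[str, object],
    levels: Dict[str, Sequence[object]],
) -> List[Tuple[str, Dict[str, object]]]:
    """Vary one factor at a time, holding all others at their baseline value."""
    configs = []
    for factor, values in levels.items():
        for value in values:
            if value == baseline[factor]:
                continue  # the baseline run already covers this setting
            config = dict(baseline)
            config[factor] = value
            configs.append((factor, config))
    return configs


def average_precision(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_ground_truth: int,
) -> float:
    """Non-interpolated AP over detections ranked by descending confidence."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if is_true_positive[idx]:
            hits += 1
            precision_sum += hits / rank  # precision at this true positive
    return precision_sum / num_ground_truth


if __name__ == "__main__":
    # Hypothetical baseline and factor levels, chosen only for illustration.
    baseline = {"train_images": 1000, "depth": 16, "width": 64}
    levels = {
        "train_images": [500, 1000, 2000],
        "depth": [8, 16, 32],
        "width": [32, 64, 128],
    }
    for factor, config in single_factor_configs(baseline, levels):
        print(factor, config)

    # Four made-up detections, two of which match a ground-truth object.
    print(average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False], 2))
```

Because each configuration differs from the baseline in exactly one factor, any change in average precision between a run and the baseline can be attributed to that factor alone, which is the point of a single-factor design.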

Cite this article as:

R. Yu, X. Xu, and Z. Wang, “Influence of Object Detection in Deep Learning,” J. Adv. Comput. Intell. Intell. Inform., Vol.22 No.5, pp. 683-688, 2018.


This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
