Paper:
A Quantitative Quality Measurement for Codebook in Feature Encoding Strategies
Yuki Shinomiya and Yukinobu Hoshino
Kochi University of Technology
Tosayamada, Kami, Kochi 782-8502, Japan
Feature encoding is now a general approach for representing a document, an image, or an audio signal as a feature vector. In image recognition, this approach treats an image as a set of local feature descriptors. The set is then converted into a feature vector using basis vectors called a codebook. This paper focuses on the prior probability, which is one of the codebook parameters, and analyzes how feature encoding depends on it. We conducted two experiments: an analysis of prior probabilities in state-of-the-art encodings, and a controlled manipulation of prior probabilities. The first experiment investigates the distribution of prior probabilities and compares the recognition performance of recent techniques. The results suggest that recognition performance depends on the distribution of prior probabilities. The second experiment carries out further statistical analysis by controlling the distribution of prior probabilities. The results show a strong negative linear relationship between the standard deviation of the prior probabilities and recognition accuracy. These experiments indicate that the quality of a codebook used for feature encoding can be measured quantitatively, and that recognition performance can be improved by optimizing the codebook. Moreover, the codebook is created in an offline step, so optimizing it adds no computational cost at run time in practical applications.
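As a rough illustration of the quantity discussed in the abstract, the sketch below computes the standard deviation of a codebook's prior probabilities. The abstract does not state how the priors are estimated, so this assumes the simplest case: each descriptor is hard-assigned to one codeword, and a codeword's prior is the fraction of descriptors assigned to it. The function names `estimate_priors` and `prior_std` and the toy assignments are hypothetical, not taken from the paper.

```python
import numpy as np


def estimate_priors(assignments, n_codewords):
    """Estimate prior probabilities as codeword occupancy fractions.

    Assumption: hard assignment, one codeword index per descriptor.
    """
    counts = np.bincount(assignments, minlength=n_codewords)
    return counts / counts.sum()


def prior_std(priors):
    """Population standard deviation of the prior probabilities."""
    return float(np.std(priors))


# Toy data: 8 descriptors assigned to a 4-word codebook.
uniform = estimate_priors(np.array([0, 1, 2, 3, 0, 1, 2, 3]), 4)
skewed = estimate_priors(np.array([0, 0, 0, 0, 0, 1, 2, 3]), 4)

print(uniform, prior_std(uniform))  # evenly used codebook
print(skewed, prior_std(skewed))    # one codeword dominates
```

Under the relationship reported in the abstract, the evenly used codebook (lower standard deviation) would be expected to score as the higher-quality one; the sketch only computes the measure and makes no claim about accuracy.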
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.