Content-Based Image Retrieval via Combination of Similarity Measures

Kazushi Okamoto; Fangyan Dong; Shinichi Yoshida; Kaoru Hirota

doi:10.20965/jaciii.2011.p0687

JACIII Vol.15 No.6 pp. 687-697

(2011)

Paper:

Kazushi Okamoto^*, Fangyan Dong^*, Shinichi Yoshida^**, and Kaoru Hirota^*

^*Dept. of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, G3-49, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan

^**School of Information, Kochi University of Technology, 185 Tosayamada-cho-Miyanokuchi, Kochi 782-8502, Japan

Received: December 16, 2010
Accepted: March 30, 2011
Published: August 20, 2011

Keywords: image retrieval, similarity measure, local feature, indexing, retrieval accuracy

Abstract

A framework that combines multiple (dis)similarity measures through normalization and weighting is proposed to find measure combinations that are suitable in terms of retrieval accuracy and computational cost. Compared with the Minkowski distance, the measure with the best retrieval accuracy in the single-measure scenario, the combination of the Manhattan and Hellinger distances is more than 12 times faster to compute, and its retrieval accuracy improves or remains at the same level. These results are obtained on a visual-word-based image retrieval system using the Corel collections. Because this combination reduces computational cost while keeping retrieval accuracy robust, its applications include retrieval over databases with a large number of images and categories.
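The idea of combining measures through normalization and weighting can be illustrated with a minimal Python sketch. The abstract does not state which normalization or weighting scheme is used, so everything specific here is an assumption made for illustration: min-max normalization over the database, equal weights, a weighted sum as the combination rule, the 1/sqrt(2)-scaled form of the Hellinger distance, and the function names `manhattan`, `hellinger`, `min_max`, `combined_scores`, and `rank`. Images are assumed to be represented as visual word histograms that sum to 1.

```python
import math


def manhattan(p, q):
    """Manhattan (L1) distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))


def hellinger(p, q):
    """Hellinger distance, using the common 1/sqrt(2) scaling (an assumption)."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * s)


def min_max(values):
    """Rescale distances to [0, 1]; assumed normalization, not from the paper."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(query, database, weights=(0.5, 0.5)):
    """Weighted sum of the two normalized distances (hypothetical weights)."""
    w_m, w_h = weights
    d_m = min_max([manhattan(query, x) for x in database])
    d_h = min_max([hellinger(query, x) for x in database])
    return [w_m * a + w_h * b for a, b in zip(d_m, d_h)]


def rank(query, database, weights=(0.5, 0.5)):
    """Return database indices ordered from most to least similar."""
    scores = combined_scores(query, database, weights)
    return sorted(range(len(database)), key=lambda i: scores[i])


# Toy visual word histograms (made-up values, each summing to 1).
database = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.4, 0.5, 0.1],
]
query = [0.5, 0.5, 0.0]
ranking = rank(query, database)
print(ranking)  # [0, 2, 1]
```

Normalizing each measure before weighting keeps one measure from dominating the sum merely because its raw values span a larger range. In a setting like the one the abstract describes, the weights would presumably be chosen by comparing retrieval accuracy and computation time across candidate combinations.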

Cite this article as:

K. Okamoto, F. Dong, S. Yoshida, and K. Hirota, “Content-Based Image Retrieval via Combination of Similarity Measures,” J. Adv. Comput. Intell. Intell. Inform., Vol.15 No.6, pp. 687-697, 2011.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
