An Evaluation Strategy for Visual Key Image Retrieval on Mobile Devices

Kazushi Okamoto; Kazuhiko Kawamoto; Fangyan Dong; Shinichi Yoshida; Kaoru Hirota

doi:10.20965/jaciii.2012.p0713

JACIII Vol.16 No.6 pp. 713-722

(2012)

Paper:

Kazushi Okamoto*1, Kazuhiko Kawamoto*1,*2, Fangyan Dong*3, Shinichi Yoshida*4, and Kaoru Hirota*3

*1 Academic Link Center, Chiba University

*2 Institute of Media and Information Technology, Chiba University, 1-33 Yayoicho, Inage-ku, Chiba 263-8522, Japan

*3 Dept. of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, G3-49, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan

*4 School of Information, Kochi University of Technology, 185 Tosayamada-cho-Miyanokuchi, Kochi 782-8502, Japan

Received: February 20, 2012

Accepted: June 20, 2012

Published: September 20, 2012

Keywords: image retrieval, retrieval accuracy, indexing, clustering, visual feature

Abstract

An evaluation strategy for visual key image retrieval systems is proposed in order to derive design criteria for a querying interface on mobile devices. Indexes (lists of visual keys) generated with different numbers of visual keys and different visual features are validated using Art-Explosion 600,000, which contains about 300 semantic categories and over 100,000 natural photos. The results suggest that accessing a collection through a visual key can provide one relevant image within rank 10 and 4 relevant images within rank 20 when the number of visual keys is 60, which is the lower limit. On portable devices that display 16 visual keys per page, users can at least access a required image by browsing only 4 pages of the 60 visual keys, and can then use that image for related subsequent queries through the other image retrieval functions.
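The page count quoted in the abstract follows from simple arithmetic: 60 visual keys at 16 keys per page need 4 pages. The minimal Python sketch below reproduces that calculation and also shows one way the "relevant images within rank k" figures could be counted. The function names and the example relevance list are hypothetical and are not taken from the paper; the relevance list is made-up data chosen only to illustrate the counting.

```python
import math


def pages_needed(num_visual_keys: int, keys_per_page: int) -> int:
    """Return how many pages are needed to list every visual key."""
    if num_visual_keys < 0 or keys_per_page <= 0:
        raise ValueError("need num_visual_keys >= 0 and keys_per_page > 0")
    return math.ceil(num_visual_keys / keys_per_page)


def relevant_within_rank(ranked_relevance: list[bool], rank: int) -> int:
    """Count relevant images among the first `rank` retrieved images."""
    return sum(1 for is_relevant in ranked_relevance[:rank] if is_relevant)


# Figures from the abstract: 60 visual keys, 16 keys shown per page.
print(pages_needed(60, 16))  # 4

# Hypothetical ranked result list (True = relevant), for illustration only.
example_ranking = [False] * 9 + [True] + [False] * 7 + [True] * 3
print(relevant_within_rank(example_ranking, 10))  # 1
print(relevant_within_rank(example_ranking, 20))  # 4
```

Rounding up matters here: 60 / 16 is 3.75, so the last page is only partly filled, but it still has to be browsed.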

Cite this article as:

K. Okamoto, K. Kawamoto, F. Dong, S. Yoshida, and K. Hirota, “An Evaluation Strategy for Visual Key Image Retrieval on Mobile Devices,” J. Adv. Comput. Intell. Intell. Inform., Vol.16 No.6, pp. 713-722, 2012.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
