JRM Vol.23 No.6 pp. 1080-1090
doi: 10.20965/jrm.2011.p1080


Image-Searching for Office Equipment Using Bag-of-Keypoints and AdaBoost

Seiji Aoyagi, Atsushi Kohama, Yuki Inaura,
Masato Suzuki, and Tomokazu Takahashi

Faculty of Engineering, Kansai University, 3-3-35 Yamate-cho, Suita, Osaka 564-8680, Japan

Received: October 9, 2010
Accepted: July 4, 2011
Published: December 20, 2011
Keywords: bag-of-keypoints, SIFT, AdaBoost, specific object recognition

For Simultaneous Localization and Mapping (SLAM) of an indoor mobile robot, a method that processes only a single monocular image (640×480 pixels) of the environment is proposed. This method imitates the human ability to grasp at a glance the overall situation of a room, i.e., its layout and any objects or obstacles in it. Specific object recognition of a desk using several camera angles is dealt with as one example. The proposed method has the following steps. 1) The bag-of-keypoints method is applied to the input image to detect whether the object exists in it. 2) If the existence of the object is verified, the angle of the object is then detected, again using the bag-of-keypoints method. 3) Candidates for the projection from the template image to the input image are obtained using the Scale Invariant Feature Transform (SIFT) or edge information. Whether or not the projected area correctly corresponds to the object is checked using an AdaBoost classifier based on various image features, such as Haar-like features. Through these steps, the desk, if it exists in the image, is eventually extracted together with its angle information.
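The bag-of-keypoints representation used in steps 1) and 2) can be illustrated with a minimal sketch. The abstract does not state how the codebook is built or which classifier consumes the histogram, so everything here is an assumption for illustration: the function names `nearest_word` and `bag_of_keypoints_histogram` are hypothetical, the codebook is fixed by hand (in practice it is commonly obtained by clustering training descriptors), and toy 2-D vectors stand in for 128-dimensional SIFT descriptors.

```python
from collections import Counter


def nearest_word(descriptor, codebook):
    """Return the index of the visual word closest to one descriptor."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        # Squared Euclidean distance is enough for finding the nearest word.
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bag_of_keypoints_histogram(descriptors, codebook):
    """Quantize local descriptors and return a normalized word histogram."""
    if not descriptors:
        # No keypoints detected: return an all-zero histogram.
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(codebook))]


# Toy data (assumed): two visual words and four keypoint descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [0.0, 0.0]]
histogram = bag_of_keypoints_histogram(descriptors, codebook)
print(histogram)  # [0.75, 0.25]
```

A histogram of this kind is a fixed-length vector regardless of how many keypoints the image yields, which is what allows it to be passed to a classifier that decides whether the object is present or which viewing angle it has.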

Cite this article as:
Seiji Aoyagi, Atsushi Kohama, Yuki Inaura, Masato Suzuki, and Tomokazu Takahashi, “Image-Searching for Office Equipment Using Bag-of-Keypoints and AdaBoost,” J. Robot. Mechatron., Vol.23, No.6, pp. 1080-1090, 2011.
