Utilizing Google Images for Training Classifiers in CRF-Based Semantic Segmentation
Rizki Perdana Rangkuti*, Vektor Dewanto*,**, Aprinaldi*, and Wisnu Jatmiko*
*Faculty of Computer Science, Universitas Indonesia
16424, Depok, Jawa Barat, Indonesia
**Department of Computer Science, Bogor Agricultural University
Bogor, Jawa Barat, Indonesia
One promising approach to pixel-wise semantic segmentation is based on conditional random fields (CRFs). CRF-based semantic segmentation requires ground-truth annotations for supervised training of the classifier that generates unary potentials. However, the amount of publicly available annotated training data is small. We observe that the Internet can provide relevant images for any given keyword. Our idea is to convert keyword-related images into pixel-wise annotated images and then use them as training data. In particular, we rely on saliency filters to identify the salient object (foreground) of a retrieved image, which in most cases corresponds to the given keyword. We then use this saliency information in a background-foreground CRF-based segmentation to obtain pixel-wise ground-truth annotations. Experimental results show that training data from Google Images improves both the learning performance and the accuracy of semantic segmentation. This suggests that the proposed method is promising for harvesting substantial amounts of training data from the Internet to train the classifier in CRF-based semantic segmentation.
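The conversion from a saliency map to a pixel-wise annotation can be illustrated with a minimal sketch. It is not the authors' method: simple thresholding stands in for the background-foreground CRF step, and the function name `saliency_to_annotation`, the thresholds `fg_thresh` and `bg_thresh`, and the `VOID_LABEL` value are all assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical "ignore" label for pixels whose saliency is ambiguous.
VOID_LABEL = 255


def saliency_to_annotation(saliency, class_id, fg_thresh=0.7, bg_thresh=0.3,
                           background_id=0):
    """Convert a saliency map with values in [0, 1] into a label map.

    Pixels with high saliency receive the keyword's class id, pixels with
    low saliency receive the background id, and the rest are marked void.
    Thresholding is an assumed simplification of the CRF-based
    background-foreground segmentation described in the abstract.
    """
    saliency = np.asarray(saliency, dtype=np.float64)
    labels = np.full(saliency.shape, VOID_LABEL, dtype=np.uint8)
    labels[saliency >= fg_thresh] = class_id
    labels[saliency <= bg_thresh] = background_id
    return labels


if __name__ == "__main__":
    # Toy 3x3 saliency map; class id 3 is an arbitrary example keyword class.
    toy_saliency = np.array([[0.9, 0.8, 0.1],
                             [0.6, 0.5, 0.2],
                             [0.1, 0.0, 0.05]])
    print(saliency_to_annotation(toy_saliency, class_id=3))
```

In this sketch the void pixels would be excluded when training the unary classifier, so that only confidently labeled foreground and background pixels contribute.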