
Int. J. Automation Technol. (IJAT), Vol.19, No.4, pp. 608-617, 2025
doi: 10.20965/ijat.2025.p0608

Technical Paper:

A Method of Constructing a Food Classification Image Dataset by Cleansing Web-Crawling Data

Kazuki Kiryu, Masaki Miyamoto, and Akio Nakamura

Tokyo Denki University
5 Senju-Asahi-cho, Adachi-ku, Tokyo 120-8551, Japan


Received: November 29, 2024
Accepted: February 5, 2025
Published: July 5, 2025
Keywords: food recognition, convolutional neural network, web crawling, data cleansing
Abstract

We propose a method of constructing image datasets for food recognition with a convolutional neural network (CNN) by cleansing web-crawled data. A dataset was first built by crawling recipe-posting websites and collecting food images together with their class labels. The collected images included ones that a CNN cannot learn effectively, such as images of foods that look extremely similar to foods of other classes, or images whose food and class label do not match; we term these “content and description discrepancy images.” Such images were filtered out using two criteria based on food recognition results obtained with CNNs: the first is a threshold on the difference between the estimated class probabilities, and the second is whether the estimated class matches the labeled food class. These criteria were applied using multiple classifiers, and a new, smaller image dataset was constructed from the images that satisfied them. A CNN was then trained on the constructed dataset, and its food recognition accuracy was evaluated on a test dataset. The accuracy with the dataset constructed by the proposed method was 7.4% higher than that with the raw web-crawled dataset. These results show that the proposed method can efficiently construct a food image dataset and confirm the data-cleansing effect of the two selection criteria.

Cite this article as:
K. Kiryu, M. Miyamoto, and A. Nakamura, “A Method of Constructing a Food Classification Image Dataset by Cleansing Web-Crawling Data,” Int. J. Automation Technol., Vol.19 No.4, pp. 608-617, 2025.

