
JACIII Vol.27 No.2 pp. 182-189
doi: 10.20965/jaciii.2023.p0182
(2023)

Research Paper:

Fused Architecture with Enhanced Bag of Visual Words for Efficient Drowsiness Detection

Vineetha Vijayan and K. P. Pushpalatha

Mahatma Gandhi University
Priyadarsini Hills, Kottayam, Kerala 686560, India

Received:
May 25, 2022
Accepted:
October 22, 2022
Published:
March 20, 2023
Keywords:
SIFT, FLANN-SIFT, bag of visual words, AlexNet, k-means
Abstract

Drowsy driving is more hazardous than reckless driving. This study concentrates on capturing the behavioral features of drowsiness from facial images of a driver. The methodology uses the scale-invariant feature transform (SIFT), matched with the fast library for approximate nearest neighbors (FLANN), for low-level drowsy feature extraction. These features are fused with the high-level features extracted from the convolutional layers of a convolutional neural network (CNN). The convolution operation incorporates a model-parallelization technique to increase training efficiency and improve feature identification. Classification is then performed on the occurrences of visual words using the softmax layers of the CNN. In contrast to existing state-of-the-art models, which require a few seconds to detect drowsiness, this model detects drowsiness in milliseconds. With the model-parallelization approach, the model achieves a high accuracy of 83.8% relative to normal CNNs.
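As a rough illustration of the front end of this pipeline, the following is a minimal Python sketch of FLANN-matched SIFT extraction and bag-of-visual-words encoding, assuming OpenCV and scikit-learn. The drowsy template image, KD-tree parameters, ratio threshold, and cluster count are illustrative assumptions, not the authors' implementation.

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    def flann_sift_descriptors(img_gray, template_gray, ratio=0.7):
        """Keep the SIFT descriptors of img_gray that FLANN-match a drowsy template."""
        sift = cv2.SIFT_create()
        _, des_img = sift.detectAndCompute(img_gray, None)
        _, des_tpl = sift.detectAndCompute(template_gray, None)
        # KD-tree index (FLANN_INDEX_KDTREE = 1) for approximate nearest neighbors
        flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
        matches = flann.knnMatch(des_img, des_tpl, k=2)
        good = []
        for pair in matches:
            # Lowe's ratio test keeps only distinctive matches
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                good.append(des_img[pair[0].queryIdx])
        return np.float32(good)

    def bovw_histogram(descriptors, codebook):
        """Quantize matched descriptors against a k-means codebook of visual words."""
        words = codebook.predict(descriptors)
        hist = np.bincount(words, minlength=codebook.n_clusters).astype(np.float32)
        return hist / max(hist.sum(), 1.0)  # normalized visual-word occurrences

    # Codebook learned offline from training descriptors (k = 200 is assumed):
    # codebook = KMeans(n_clusters=200).fit(all_training_descriptors)

Per the abstract, the normalized visual-word histogram produced this way would then be fused with the high-level convolutional features of a CNN before the softmax layers; that fusion and the model-parallel convolution are not shown here.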

Realtime drowsiness detection

Cite this article as:
V. Vijayan and K. P. Pushpalatha, “Fused Architecture with Enhanced Bag of Visual Words for Efficient Drowsiness Detection,” J. Adv. Comput. Intell. Intell. Inform., Vol.27 No.2, pp. 182-189, 2023.
