
JACIII Vol.28, No.6, pp. 1273-1283, 2024
doi: 10.20965/jaciii.2024.p1273

Research Paper:

RCT-YOLOv8: A Tuna Detection Model for Distant-Water Fisheries Based on Improved YOLOv8

Qingyi Zhou and Yuqing Liu

College of Engineering Science and Technology, Shanghai Ocean University
No.999 Hucheng Ring Road, Pudong New Area, Shanghai 201306, China

Corresponding author

Received: April 28, 2024
Accepted: August 5, 2024
Published: November 20, 2024
Keywords: YOLOv8, deep learning, object detection, tuna detection
Abstract

With the development of distant-water fisheries, ship-based fishing and fish catch detection have become vital to modern fishing. Existing manual detection methods are prone to missed and false detections. Deep learning has enabled the deployment of detection models on shipboard devices, offering a new solution. However, many existing models have large parameter counts and high computational complexity, making them unsuitable for shipboard use, where computational resources and budgets are limited. To address these challenges, we propose the RCT-YOLOv8 model for tuna catch detection in this paper. Specifically, we adopt YOLOv8 as the base model and replace its backbone with the RepVGG network, which employs re-parameterized convolutions to enhance detection accuracy. Additionally, we incorporate coordinate attention at the end of the backbone to better aggregate channel-wise information. In the neck, we introduce contextual transformer (CoT) attention and propose the C2F-CoT module, which combines a convolutional neural network with a Transformer to capture global features, thereby improving detection accuracy and the effectiveness of feature propagation. We test multiple loss functions and select the efficient intersection over union (EIoU) loss, which is best suited to our algorithm. Furthermore, to adapt to devices with limited computational resources, we compress the network with dependency-graph-based pruning. Compared to the base network, the pruned model achieves a 9.8% increase in detection accuracy while reducing the parameter count and computational complexity by 40% and 35.8%, respectively. Compared with various other algorithms, the pruned model achieves the highest detection accuracy with the lowest parameter count and computational complexity, the best results on all fronts.
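
This page does not include the authors' implementation, but two of the components named in the abstract are precisely defined in their original papers and can be sketched for reference. Below is a minimal, hypothetical PyTorch sketch of coordinate attention as formulated by Hou et al. (CVPR 2021): the feature map is average-pooled along each spatial axis, the two direction-aware descriptors are jointly encoded, and the encoding is split back into per-axis attention maps. The module name, reduction ratio, and activation are illustrative choices, not taken from this paper.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Coordinate attention (Hou et al., CVPR 2021); minimal illustrative sketch."""

    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # Pool along each spatial axis to get direction-aware descriptors.
        x_h = x.mean(dim=3, keepdim=True)                      # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).transpose(2, 3)      # (n, c, w, 1)
        # Jointly encode both directions, then split back apart.
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = y.split([h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                  # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))  # (n, c, 1, w)
        # Reweight the input with the two per-axis attention maps.
        return x * a_h * a_w
```

The EIoU loss chosen for bounding-box regression can be sketched the same way. Following Zhang et al. (Neurocomputing, 2022), EIoU adds to the IoU term a center-distance penalty plus separate width and height penalties, each normalized by the smallest enclosing box; the sketch assumes boxes in (x1, y1, x2, y2) corner format.

```python
import torch

def eiou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """EIoU loss for axis-aligned boxes; last dim is (x1, y1, x2, y2)."""
    # Intersection and union.
    iw = (torch.min(pred[..., 2], target[..., 2]) - torch.max(pred[..., 0], target[..., 0])).clamp(min=0)
    ih = (torch.min(pred[..., 3], target[..., 3]) - torch.max(pred[..., 1], target[..., 1])).clamp(min=0)
    inter = iw * ih
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Smallest enclosing box and its squared diagonal.
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw.pow(2) + ch.pow(2) + eps
    # Squared distance between box centers.
    dx = (pred[..., 0] + pred[..., 2] - target[..., 0] - target[..., 2]) / 2
    dy = (pred[..., 1] + pred[..., 3] - target[..., 1] - target[..., 3]) / 2
    # Width/height penalties, each normalized by the enclosing box.
    wp, hp = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    wt, ht = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    return (1 - iou
            + (dx.pow(2) + dy.pow(2)) / c2
            + (wp - wt).pow(2) / (cw.pow(2) + eps)
            + (hp - ht).pow(2) / (ch.pow(2) + eps))
```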

Cite this article as:
Q. Zhou and Y. Liu, “RCT-YOLOv8: A Tuna Detection Model for Distant-Water Fisheries Based on Improved YOLOv8,” J. Adv. Comput. Intell. Intell. Inform., Vol.28 No.6, pp. 1273-1283, 2024.
