Research Paper:
Deep Learning-Based Scallop Detection in Seabed Images Using Active Learning and Model Comparison
Koichiro Enomoto*1, Koji Miyoshi*2, Takuma Midorikawa*3, Yasuhiro Kuwahara*4, and Masashi Toda*5

*1Regional ICT Research Center of Human, Industry and Future, The University of Shiga Prefecture
2500 Hassaka-cho, Hikone, Shiga 522-8533, Japan
*2Hokkaido Research Organization, Fisheries Research Department, Central Fisheries Research Institute
Yoichi, Japan
*3Ebisu System Co., Ltd.
Sapporo, Japan
*4Hokkaido Research Organization, Fisheries Research Department, Mariculture Fisheries Research Institute
Muroran, Japan
*5Kumamoto University
Kumamoto, Japan
This study proposes a method for detecting scallops in seabed images using deep learning-based instance segmentation and active learning. The method uses a mask region-based convolutional neural network (Mask R-CNN) combined with active learning to enable efficient annotation and adaptive learning across different seabed environments. A comparison with the transformer-based deformable detection transformer (Deformable DETR) provides a detailed evaluation of detection performance. The proposed method effectively detects object features while suppressing unnecessary background regions in noisy seabed environments. Active learning with margin sampling streamlines the annotation process and builds an effective training dataset from a large volume of seabed images. Experiments on a dataset of over 83,000 seabed images show that Mask R-CNN outperforms Deformable DETR, achieving an F-measure (the harmonic mean of precision and recall) of 0.89 versus 0.85. This study contributes to fishery resource investigations by providing an approach for efficient retraining on new data, which is crucial for maintaining accurate scallop detection systems over time.
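The margin-sampling criterion mentioned in the abstract ranks unlabeled images by how ambiguous the current model finds them, so annotation effort is spent only where the model is uncertain. The sketch below is a minimal, generic illustration of that idea, not the authors' implementation: the `margin_sampling` function, the two-class probability array, and the `budget` parameter are all hypothetical names introduced for exposition.

```python
import numpy as np

def margin_sampling(probs: np.ndarray, budget: int) -> np.ndarray:
    """Pick the `budget` most ambiguous samples for annotation.

    probs: (n_samples, n_classes) softmax outputs of the current model.
    A small gap between the top-two class probabilities means the model
    is unsure, so those samples are the most informative to label.
    """
    sorted_probs = np.sort(probs, axis=1)            # ascending per sample
    margins = sorted_probs[:, -1] - sorted_probs[:, -2]
    return np.argsort(margins)[:budget]              # smallest margins first

# Hypothetical scores for four seabed images (scallop vs. background):
probs = np.array([
    [0.51, 0.49],   # ambiguous -> worth annotating
    [0.95, 0.05],   # confident -> skip
    [0.60, 0.40],
    [0.55, 0.45],   # ambiguous -> worth annotating
])
print(margin_sampling(probs, budget=2))  # -> [0 3]
```

In a detection setting such as Mask R-CNN's, the class scores would come from the model's detection heads and would typically be aggregated per image before ranking; only the selected images are then sent to human annotators and added to the training set.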