Multifeature extraction based MobileViTv3 model for fish feeding behavior recognition from video streaming

The recognition of fish feeding behavior based on machine vision is essential for optimizing fish feeding strategies and enhancing the efficiency of aquaculture. Building an efficient, multi-feature extraction model for fish feeding recognition, especially on mobile and edge devices, remains a signi...

Bibliographic Details
Main Authors: Zheng Zhang, Menglu Chen, Qingsong Hu, Yanbing Shen
Format: Article
Language: English
Published: Elsevier 2025-03-01
Series: Ecological Informatics
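Subjects: Micropterus salmoides; Feeding behavior recognition; Machine vision; Multifeature extraction; MobileViTv3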
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954124004734
author Zheng Zhang
Menglu Chen
Qingsong Hu
Yanbing Shen
collection DOAJ
description The recognition of fish feeding behavior based on machine vision is essential for optimizing fish feeding strategies and enhancing the efficiency of aquaculture. Building an efficient, multi-feature extraction model for fish feeding recognition, especially on mobile and edge devices, remains a significant challenge. In this paper, we propose a novel multi-feature extraction (MFE)-MobileViTv3 model, which improves MobileViTv3 with MFE blocks. It can extract spatio-temporal features while remaining lightweight. The MFE block is designed by improving ActionNet with the frequency channel attention (FCA) and multi-head self-attention (MHSA) mechanisms. It can fully extract spatio-temporal, motion, and channel features from video streams, thereby improving the feature extraction capability and, in turn, the model's recognition accuracy. The experiments were carried out on an industrial aquaculture farm. We built a dataset of Micropterus salmoides and then conducted comparative experiments. Compared with C3D, R3D, ResNet50, SlowFast, AlexNet, and MobileNetV3, our model achieves a classification accuracy of 96.7% for feeding intensity, with fewer parameters (0.96 M) and FLOPs (8.44 G). The results show that the proposed model can effectively recognize fish feeding behavior with fewer parameters. Additionally, we introduce two evaluation metrics for the feeding process: Average Feeding Intensity and Strong Feeding Ratio. These metrics support the quantitative evaluation of the fish's vigor and health status.
format Article
id doaj-art-48c7b19a003f45fba8188046441fc677
institution Kabale University
issn 1574-9541
language English
publishDate 2025-03-01
publisher Elsevier
record_format Article
series Ecological Informatics
spelling doaj-art-48c7b19a003f45fba8188046441fc677
2025-01-19T06:24:34Z
eng
Elsevier
Ecological Informatics
1574-9541
2025-03-01
Volume 85, article 102931
Multifeature extraction based MobileViTv3 model for fish feeding behavior recognition from video streaming
Zheng Zhang (College of Engineering Science and Technology, Shanghai Ocean University, Shanghai 201306, China)
Menglu Chen (College of Engineering Science and Technology, Shanghai Ocean University, Shanghai 201306, China)
Qingsong Hu (College of Engineering Science and Technology, Shanghai Ocean University, Shanghai 201306, China)
Yanbing Shen (corresponding author; College of Engineering Science and Technology, Shanghai Ocean University, Shanghai 201306, China)
http://www.sciencedirect.com/science/article/pii/S1574954124004734
Micropterus salmoides
Feeding behavior recognition
Machine vision
Multifeature extraction
MobileViTv3
title Multifeature extraction based MobileViTv3 model for fish feeding behavior recognition from video streaming
topic Micropterus salmoides
Feeding behavior recognition
Machine vision
Multifeature extraction
MobileViTv3
url http://www.sciencedirect.com/science/article/pii/S1574954124004734