A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation

Video instance segmentation, a key technology for intelligent sensing in visual perception, plays a key role in automated surveillance, robotics, and smart cities. These scenarios rely on real-time and efficient target-tracking capabilities for accurate perception and intelligent analysis of dynamic...

Bibliographic Details
Main Authors:	Mingzhu Liu, Wei Zhang, Haoran Wei
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Sensors
Subjects:	video understanding; video transformer; visual perception intelligent sensing; video instance segmentation; lightweight
Online Access:	https://www.mdpi.com/1424-8220/25/2/459

author	Mingzhu Liu; Wei Zhang; Haoran Wei
collection	DOAJ
description	Video instance segmentation, a key technology for intelligent sensing in visual perception, plays a key role in automated surveillance, robotics, and smart cities. These scenarios rely on real-time and efficient target-tracking capabilities for accurate perception and intelligent analysis of dynamic environments. However, traditional video instance segmentation methods face complex models, high computational overheads, and slow segmentation speeds in time-series feature extraction, especially in resource-constrained environments. To address these challenges, a Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation (DCFA-LVIS) is proposed in this paper. In feature extraction, a DCEResNet backbone network structure based on a dual-channel feature enhancement mechanism is designed to improve the model’s accuracy by enhancing the feature extraction and representation capabilities. In instance tracking, a dual-frequency perceptual enhancement network structure is constructed, which uses an independent instance query mechanism to capture temporal information and combines with a frequency-aware attention mechanism to capture instance features on different attention layers of high and low frequencies, respectively, to effectively reduce the complexity of the model, decrease the number of parameters, and improve the segmentation efficiency. Experiments show that the model proposed in this paper achieves state-of-the-art segmentation performance with few parameters on the YouTube-VIS dataset, demonstrating its efficiency and practicality. This method significantly enhances the application efficiency and adaptability of visual perception intelligent sensing technology in video data acquisition and processing, providing strong support for its widespread deployment.
format	Article
id	doaj-art-ece152747009491a941d9fcc75f763e5
institution	Kabale University
issn	1424-8220
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj-art-ece152747009491a941d9fcc75f763e5; 2025-01-24T13:49:00Z; eng; MDPI AG; Sensors; 1424-8220; 2025-01-01; 25; 2; 459; 10.3390/s25020459; A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation; Mingzhu Liu [0]; Wei Zhang [1]; Haoran Wei [2]; [0-2] The Higher Educational Key Laboratory for Measuring & Control Technology and Instrumentation of Heilongjiang Province, Harbin University of Science and Technology, Harbin 150080, China; https://www.mdpi.com/1424-8220/25/2/459; video understanding; video transformer; visual perception intelligent sensing; video instance segmentation; lightweight
title	A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation
topic	video understanding; video transformer; visual perception intelligent sensing; video instance segmentation; lightweight
url	https://www.mdpi.com/1424-8220/25/2/459


Similar Items