A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation
Video instance segmentation, a key technology for intelligent sensing in visual perception, plays a key role in automated surveillance, robotics, and smart cities. These scenarios rely on real-time and efficient target-tracking capabilities for accurate perception and intelligent analysis of dynamic...
Main Authors: | Mingzhu Liu, Wei Zhang, Haoran Wei |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Sensors |
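Subjects: | video understanding; video transformer; visual perception intelligent sensing; video instance segmentation; lightweight |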
Online Access: | https://www.mdpi.com/1424-8220/25/2/459 |
_version_ | 1832587535548678144 |
---|---|
author | Mingzhu Liu; Wei Zhang; Haoran Wei |
author_sort | Mingzhu Liu |
collection | DOAJ |
description | Video instance segmentation, a key technology for intelligent sensing in visual perception, plays a key role in automated surveillance, robotics, and smart cities. These scenarios rely on real-time and efficient target-tracking capabilities for accurate perception and intelligent analysis of dynamic environments. However, traditional video instance segmentation methods face complex models, high computational overheads, and slow segmentation speeds in time-series feature extraction, especially in resource-constrained environments. To address these challenges, a Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation (DCFA-LVIS) is proposed in this paper. In feature extraction, a DCEResNet backbone network structure based on a dual-channel feature enhancement mechanism is designed to improve the model’s accuracy by enhancing the feature extraction and representation capabilities. In instance tracking, a dual-frequency perceptual enhancement network structure is constructed, which uses an independent instance query mechanism to capture temporal information and combines with a frequency-aware attention mechanism to capture instance features on different attention layers of high and low frequencies, respectively, to effectively reduce the complexity of the model, decrease the number of parameters, and improve the segmentation efficiency. Experiments show that the model proposed in this paper achieves state-of-the-art segmentation performance with few parameters on the YouTube-VIS dataset, demonstrating its efficiency and practicality. This method significantly enhances the application efficiency and adaptability of visual perception intelligent sensing technology in video data acquisition and processing, providing strong support for its widespread deployment. |
format | Article |
id | doaj-art-ece152747009491a941d9fcc75f763e5 |
institution | Kabale University |
issn | 1424-8220 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj-art-ece152747009491a941d9fcc75f763e5; 2025-01-24T13:49:00Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2025-01-01; vol. 25, no. 2, art. 459; DOI 10.3390/s25020459; A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation; Mingzhu Liu, Wei Zhang, Haoran Wei (all three: The Higher Educational Key Laboratory for Measuring & Control Technology and Instrumentation of Heilongjiang Province, Harbin University of Science and Technology, Harbin 150080, China); https://www.mdpi.com/1424-8220/25/2/459; video understanding; video transformer; visual perception intelligent sensing; video instance segmentation; lightweight |
title | A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation |
title_sort | dual channel and frequency aware approach for lightweight video instance segmentation |
topic | video understanding; video transformer; visual perception intelligent sensing; video instance segmentation; lightweight |
url | https://www.mdpi.com/1424-8220/25/2/459 |