ADSTrack: adaptive dynamic sampling for visual tracking
Abstract The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs...
Main Authors: | Zhenhai Wang, Lutao Yuan, Ying Ren, Sen Zhang, Hongyu Tian |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2024-12-01 |
Series: | Complex & Intelligent Systems |
Subjects: | Object tracking, Adaptive transformer, Dynamic token, Auxiliary token |
Online Access: | https://doi.org/10.1007/s40747-024-01672-0 |
_version_ | 1832571150776926208 |
---|---|
author | Zhenhai Wang Lutao Yuan Ying Ren Sen Zhang Hongyu Tian |
author_facet | Zhenhai Wang Lutao Yuan Ying Ren Sen Zhang Hongyu Tian |
author_sort | Zhenhai Wang |
collection | DOAJ |
description | Abstract The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. Therefore, we propose to solve this problem using an adaptive dynamic sampling strategy in a pure Transformer-based tracker, known as ADSTrack. ADSTrack progressively reduces irrelevant, redundant negative tokens in the search region that are not related to the tracked object and the effect of noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring and adaptive sampling of important tokens, and the number of tokens sampled varies according to the input image. Moreover, the adaptive dynamic sampling strategy is a parameterless token sampling strategy that does not use additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results for seven test sets, including UAV123 and LaSOT. |
format | Article |
id | doaj-art-266d919471a74e26b8f864f3788a4c2a |
institution | Kabale University |
issn | 2199-4536 2198-6053 |
language | English |
publishDate | 2024-12-01 |
publisher | Springer |
record_format | Article |
series | Complex & Intelligent Systems |
spelling | doaj-art-266d919471a74e26b8f864f3788a4c2a; 2025-02-02T12:48:52Z; eng; Springer; Complex & Intelligent Systems; 2199-4536; 2198-6053; 2024-12-01; 111114; 10.1007/s40747-024-01672-0; ADSTrack: adaptive dynamic sampling for visual tracking; Zhenhai Wang (0); Lutao Yuan (1); Ying Ren (2); Sen Zhang (3); Hongyu Tian (4); (0) College of Information Science and Engineering, Linyi University; (1) College of Information Science and Engineering, Linyi University; (2) College of Information Science and Engineering, Linyi University; (3) College of Information Science and Engineering, Linyi University; (4) School of Physics and Electronic Engineering; Abstract The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. Therefore, we propose to solve this problem using an adaptive dynamic sampling strategy in a pure Transformer-based tracker, known as ADSTrack. ADSTrack progressively reduces irrelevant, redundant negative tokens in the search region that are not related to the tracked object and the effect of noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring and adaptive sampling of important tokens, and the number of tokens sampled varies according to the input image. Moreover, the adaptive dynamic sampling strategy is a parameterless token sampling strategy that does not use additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results for seven test sets, including UAV123 and LaSOT.; https://doi.org/10.1007/s40747-024-01672-0; Object tracking; Adaptive transformer; Dynamic token; Auxiliary token |
spellingShingle | Zhenhai Wang Lutao Yuan Ying Ren Sen Zhang Hongyu Tian ADSTrack: adaptive dynamic sampling for visual tracking Complex & Intelligent Systems Object tracking Adaptive transformer Dynamic token Auxiliary token |
title | ADSTrack: adaptive dynamic sampling for visual tracking |
title_full | ADSTrack: adaptive dynamic sampling for visual tracking |
title_fullStr | ADSTrack: adaptive dynamic sampling for visual tracking |
title_full_unstemmed | ADSTrack: adaptive dynamic sampling for visual tracking |
title_short | ADSTrack: adaptive dynamic sampling for visual tracking |
title_sort | adstrack adaptive dynamic sampling for visual tracking |
topic | Object tracking Adaptive transformer Dynamic token Auxiliary token |
url | https://doi.org/10.1007/s40747-024-01672-0 |
work_keys_str_mv | AT zhenhaiwang adstrackadaptivedynamicsamplingforvisualtracking AT lutaoyuan adstrackadaptivedynamicsamplingforvisualtracking AT yingren adstrackadaptivedynamicsamplingforvisualtracking AT senzhang adstrackadaptivedynamicsamplingforvisualtracking AT hongyutian adstrackadaptivedynamicsamplingforvisualtracking |
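
Illustrative note: the description above says that ADSTrack scores tokens, samples the important ones adaptively, keeps a number of tokens that depends on the input image, and adds no parameters. This record does not give the actual scoring or sampling rule, so the sketch below is only an assumption-laden illustration of that general idea, not the authors' method. It swaps in a simple cumulative-mass cutoff: search tokens are scored by the mean attention they receive from template tokens, and the top-scoring tokens are kept until a fixed share of the total score is covered. The names `score_tokens`, `adaptive_sample`, and `keep_mass` are hypothetical.

```python
from typing import List, Sequence


def score_tokens(attn: Sequence[Sequence[float]]) -> List[float]:
    """Score each search token by the mean attention it receives.

    attn[i][j] is a hypothetical attention weight from template token i
    to search token j. The returned scores are normalised to sum to 1.
    """
    if not attn or not attn[0]:
        raise ValueError("attn must contain at least one row and one column")
    n_search = len(attn[0])
    raw = [sum(row[j] for row in attn) / len(attn) for j in range(n_search)]
    total = sum(raw)
    if total <= 0:
        # No usable signal: fall back to uniform scores.
        return [1.0 / n_search] * n_search
    return [r / total for r in raw]


def adaptive_sample(scores: Sequence[float], keep_mass: float = 0.8) -> List[int]:
    """Keep the highest-scoring tokens until keep_mass of the score is covered.

    Assumption for illustration only: a cumulative-mass cutoff stands in
    for the paper's sampling rule. No learnable parameters are involved,
    and the number of kept tokens depends on how peaked the scores are.
    Returns the kept token indices in their original order.
    """
    if not 0.0 < keep_mass <= 1.0:
        raise ValueError("keep_mass must be in (0, 1]")
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept: List[int] = []
    covered = 0.0
    for j in order:
        kept.append(j)
        covered += scores[j]
        if covered >= keep_mass:
            break
    return sorted(kept)


if __name__ == "__main__":
    peaked = score_tokens([[0.70, 0.20, 0.05, 0.05]])  # target stands out
    flat = score_tokens([[0.25, 0.25, 0.25, 0.25]])    # nothing stands out
    print(adaptive_sample(peaked))  # [0, 1]
    print(adaptive_sample(flat))    # [0, 1, 2, 3]
```

With a peaked score distribution only two of the four tokens survive, while a flat distribution keeps all four, which is the sense in which the number of sampled tokens varies with the input.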