ADSTrack: adaptive dynamic sampling for visual tracking

Abstract The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs...

Bibliographic Details
Main Authors: Zhenhai Wang, Lutao Yuan, Ying Ren, Sen Zhang, Hongyu Tian
Format: Article
Language: English
Published: Springer 2024-12-01
Series: Complex & Intelligent Systems
Subjects: Object tracking; Adaptive transformer; Dynamic token; Auxiliary token
Online Access: https://doi.org/10.1007/s40747-024-01672-0
_version_ 1832571150776926208
author Zhenhai Wang
Lutao Yuan
Ying Ren
Sen Zhang
Hongyu Tian
author_facet Zhenhai Wang
Lutao Yuan
Ying Ren
Sen Zhang
Hongyu Tian
author_sort Zhenhai Wang
collection DOAJ
description Abstract The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process, and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. To solve this problem, we propose ADSTrack, a pure Transformer-based tracker that uses an adaptive dynamic sampling strategy. ADSTrack progressively reduces the irrelevant, redundant negative tokens in the search region that are unrelated to the tracked object, along with the effect of the noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring tokens and adaptively sampling the important ones, so the number of tokens sampled varies according to the input image. Moreover, the strategy is parameter-free: it introduces no additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results on seven test sets, including UAV123 and LaSOT.
format Article
id doaj-art-266d919471a74e26b8f864f3788a4c2a
institution Kabale University
issn 2199-4536
2198-6053
language English
publishDate 2024-12-01
publisher Springer
record_format Article
series Complex & Intelligent Systems
spelling doaj-art-266d919471a74e26b8f864f3788a4c2a
2025-02-02T12:48:52Z
eng
Springer
Complex & Intelligent Systems
2199-4536
2198-6053
2024-12-01
11 1 1 14
10.1007/s40747-024-01672-0
ADSTrack: adaptive dynamic sampling for visual tracking
Zhenhai Wang (0)
Lutao Yuan (1)
Ying Ren (2)
Sen Zhang (3)
Hongyu Tian (4)
(0) College of Information Science and Engineering Linyi University
(1) College of Information Science and Engineering Linyi University
(2) College of Information Science and Engineering Linyi University
(3) College of Information Science and Engineering Linyi University
(4) School of Physics and Electronic Engineering
Abstract The most common method for visual object tracking involves feeding an image pair comprising a template image and search region into a tracker. The tracker uses a backbone to process the information in the image pair. In pure Transformer-based frameworks, redundant information in image pairs exists throughout the tracking process, and the corresponding negative tokens consume the same computational resources as the positive tokens while degrading the performance of the tracker. To solve this problem, we propose ADSTrack, a pure Transformer-based tracker that uses an adaptive dynamic sampling strategy. ADSTrack progressively reduces the irrelevant, redundant negative tokens in the search region that are unrelated to the tracked object, along with the effect of the noise generated by these tokens. The adaptive dynamic sampling strategy enhances the performance of the tracker by scoring tokens and adaptively sampling the important ones, so the number of tokens sampled varies according to the input image. Moreover, the strategy is parameter-free: it introduces no additional parameters. We add several extra tokens as auxiliary tokens to the backbone to further optimize the feature map. We extensively evaluate ADSTrack, achieving satisfactory results on seven test sets, including UAV123 and LaSOT.
https://doi.org/10.1007/s40747-024-01672-0
Object tracking
Adaptive transformer
Dynamic token
Auxiliary token
title ADSTrack: adaptive dynamic sampling for visual tracking
topic Object tracking
Adaptive transformer
Dynamic token
Auxiliary token
url https://doi.org/10.1007/s40747-024-01672-0