Hierarchical Feature Attention Learning Network for Detecting Object and Discriminative Parts in Fine-Grained Visual Classification

This paper proposes a novel hierarchical feature attention learning network for improved fine-grained visual classification (FGVC). Existing fine-grained classification methods rely heavily on attention mechanisms to differentiate minute details of similar objects. These mechanisms often assume that...

Bibliographic Details
Main Authors: A. Yeong Han, Kwang Moo Yi, Kyeong Tae Kim, Jae Young Choi
Format: Article
Language: English
Published: IEEE 2025-01-01
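Series: IEEE Access
Subjects: Fine-grained visual classification (FGVC); hierarchical feature attention learning; weakly supervised detection; multi-head attention
Online Access: https://ieeexplore.ieee.org/document/10854460/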
_version_ 1832542563225042944
collection DOAJ
description This paper proposes a novel hierarchical feature attention learning network for improved fine-grained visual classification (FGVC). Existing fine-grained classification methods rely heavily on attention mechanisms to distinguish minute details of similar objects. These mechanisms often assume that critical locations have a similar scale and are uniquely localizable, which does not always hold. For instance, the size of a bird may vary across images, and the color of its beak might be significant for species identification only when its wing and tail have specific colors. This paper addresses this limitation with a hierarchical feature attention learning network that first focuses on the target object within the image and then applies multi-headed attention to identify key discriminative locations (patches). In particular, we develop a novel hierarchical attention approach that reduces misleading attention by taking the object’s size into account when capturing the correct attention parts. In addition, the proposed multi-headed attention examines multiple complementary attention parts to identify the most discriminative features. Further, our framework is implemented as an architectural constraint, so detection is learned in a weakly supervised manner without object-level or part-level annotations. We conducted extensive comparative experiments on three benchmark datasets: CUB-200-2011, NABirds, and Oxford 102 Flower. The results demonstrate that the proposed hierarchical attention approach provides a robust and efficient solution for improved FGVC. Specifically, our method achieved top-1 accuracies of approximately 93.0%, 92.7%, and 99.4% on the CUB-200-2011, NABirds, and Oxford 102 Flower benchmarks, respectively.
format Article
id doaj-art-1a26481913c347d3a065096163443b1a
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-1a26481913c347d3a065096163443b1a
timestamp: 2025-02-04T00:00:41Z
language: eng
publisher: IEEE
journal: IEEE Access
issn: 2169-3536
publication date: 2025-01-01
volume: 13
pages: 19533-19544
doi: 10.1109/ACCESS.2025.3534444
document number: 10854460
title: Hierarchical Feature Attention Learning Network for Detecting Object and Discriminative Parts in Fine-Grained Visual Classification
authors:
A. Yeong Han [0]
Kwang Moo Yi [1]
Kyeong Tae Kim [2]
Jae Young Choi [3] https://orcid.org/0000-0002-3438-8248
affiliations:
[0] Department of Computer Engineering, Hankuk University of Foreign Studies, Yongin-si, Republic of Korea
[1] Department of Computer Science, The University of British Columbia, Vancouver, BC, Canada
[2] Department of Computer Engineering, Hankuk University of Foreign Studies, Yongin-si, Republic of Korea
[3] Department of Computer Engineering, Hankuk University of Foreign Studies, Yongin-si, Republic of Korea
abstract: identical to the description field above
url: https://ieeexplore.ieee.org/document/10854460/
keywords: Fine-grained visual classification (FGVC); hierarchical feature attention learning; weakly supervised detection; multi-head attention
topic Fine-grained visual classification (FGVC)
hierarchical feature attention learning
weakly supervised detection
multi-head attention
url https://ieeexplore.ieee.org/document/10854460/
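Illustrative sketch (not part of the catalog record). The description field outlines a two-stage idea: first focus on the target object, then let several attention heads pick discriminative patches inside it. The record gives no implementation details, so the toy code below is only a guess at the general shape of such a pipeline and is not the authors' network. The threshold-based localization, the dot-product attention, and all names (object_box, head_attention, discriminative_patches, the sample values) are assumptions made for illustration.

```python
import math

def object_box(activation, rel_threshold=0.5):
    """Stage 1 (assumed): bounding box around cells whose activation is at
    least rel_threshold * peak. Returns (top, left, bottom, right), inclusive."""
    peak = max(max(row) for row in activation)
    hot = [(r, c) for r, row in enumerate(activation)
           for c, value in enumerate(row) if value >= rel_threshold * peak]
    rows = [r for r, _ in hot]
    cols = [c for _, c in hot]
    return min(rows), min(cols), max(rows), max(cols)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def head_attention(patches, query):
    """Scaled dot-product attention weights of one head over patch features."""
    scale = math.sqrt(len(query))
    scores = [sum(p * q for p, q in zip(patch, query)) / scale
              for patch in patches]
    return softmax(scores)

def discriminative_patches(patches, head_queries):
    """Stage 2 (assumed): each head selects the patch it weights most."""
    picks = []
    for query in head_queries:
        weights = head_attention(patches, query)
        picks.append(max(range(len(weights)), key=weights.__getitem__))
    return picks

# Hypothetical 4x4 activation map with the object in the centre.
activation = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
# Hypothetical 2-d feature vector per grid cell (background defaults to zeros).
features = {(1, 1): [1.0, 0.0], (1, 2): [0.0, 1.0],
            (2, 1): [0.5, 0.5], (2, 2): [0.2, 0.1]}
# Hypothetical query vector per attention head.
head_queries = [[1.0, 0.0], [0.0, 1.0]]

box = object_box(activation)
top, left, bottom, right = box
# Only cells inside the object box are offered to the attention heads.
cells = [(r, c) for r in range(top, bottom + 1)
         for c in range(left, right + 1)]
patches = [features.get(cell, [0.0, 0.0]) for cell in cells]
picks = discriminative_patches(patches, head_queries)

print("object box:", box)
print("patch chosen per head:", [cells[i] for i in picks])
```

Restricting the heads to cells inside the box mirrors the abstract's point that object size should bound where part attention may land; with different queries, the two heads settle on different, complementary patches.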