EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios

Computer vision is becoming an increasingly vital field, offering significant opportunities for real-world applications. Object counting is one of its core aspects, with increasing utilization across scientific fields involving objects of varying sizes. Traditional counting methods, however, face ch...

Bibliographic Details
Main Authors: Phu Nguyen Phan Hai, Bao Bui Quoc, Trang Hoang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
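Subjects: Computer vision; density estimation; size-adjustable Gaussian kernel; deep learning; convolutional neural networks; attention mechanisms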
Online Access: https://ieeexplore.ieee.org/document/10851276/
_version_ 1832576750383529984
author Phu Nguyen Phan Hai
Bao Bui Quoc
Trang Hoang
author_facet Phu Nguyen Phan Hai
Bao Bui Quoc
Trang Hoang
author_sort Phu Nguyen Phan Hai
collection DOAJ
description Computer vision is becoming an increasingly vital field, offering significant opportunities for real-world applications. Object counting is one of its core aspects, with increasing utilization across scientific fields involving objects of varying sizes. Traditional counting methods, however, face challenges in dense scenarios, as they are often ineffective in handling objects of different sizes. To address these challenges, this paper proposes the Efficient Multi-Scale Pyramid Attention Network (EMSPAN) model, which is designed to tackle both dense and size-heterogeneous object counting tasks. Additionally, a novel ground truth density map generation method using size-adaptive Gaussian kernels is introduced, which dynamically adjusts kernel size based on object dimensions. This approach preserves spatial information more effectively and produces more accurate density maps, even in complex scenes. The EMSPAN model utilizes advanced attention mechanisms to capture the multi-scale spatial distribution and size variations of objects. Experiments on the shrimp larvae and crowd datasets, characterized by significant size diversity of individual objects, have demonstrated the superior performance of the proposed method in handling object counting tasks in dense and size-heterogeneous environments.
format Article
id doaj-art-3f0f0f007db0470e809d082d88d4910a
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-3f0f0f007db0470e809d082d88d4910a
2025-01-31T00:01:08Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
17945
17962
10.1109/ACCESS.2025.3532962
10851276
EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
Phu Nguyen Phan Hai 0 https://orcid.org/0000-0002-8667-7746
Bao Bui Quoc 1 https://orcid.org/0000-0002-8467-0532
Trang Hoang 2 https://orcid.org/0000-0001-7317-9708
Department of Electronics, Faculty of Electrical and Electronics Engineering, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Department of Electronics, Faculty of Electrical and Electronics Engineering, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Department of Electronics, Faculty of Electrical and Electronics Engineering, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Computer vision is becoming an increasingly vital field, offering significant opportunities for real-world applications. Object counting is one of its core aspects, with increasing utilization across scientific fields involving objects of varying sizes. Traditional counting methods, however, face challenges in dense scenarios, as they are often ineffective in handling objects of different sizes. To address these challenges, this paper proposes the Efficient Multi-Scale Pyramid Attention Network (EMSPAN) model, which is designed to tackle both dense and size-heterogeneous object counting tasks. Additionally, a novel ground truth density map generation method using size-adaptive Gaussian kernels is introduced, which dynamically adjusts kernel size based on object dimensions. This approach preserves spatial information more effectively and produces more accurate density maps, even in complex scenes. The EMSPAN model utilizes advanced attention mechanisms to capture the multi-scale spatial distribution and size variations of objects. Experiments on the shrimp larvae and crowd datasets, characterized by significant size diversity of individual objects, have demonstrated the superior performance of the proposed method in handling object counting tasks in dense and size-heterogeneous environments.
https://ieeexplore.ieee.org/document/10851276/
Computer vision
density estimation
size-adjustable Gaussian kernel
deep learning
convolutional neural networks
attention mechanisms
spellingShingle Phu Nguyen Phan Hai
Bao Bui Quoc
Trang Hoang
EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
IEEE Access
Computer vision
density estimation
size-adjustable Gaussian kernel
deep learning
convolutional neural networks
attention mechanisms
title EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
title_full EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
title_fullStr EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
title_full_unstemmed EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
title_short EMSPAN: Efficient Multi-Scale Pyramid Attention Network for Object Counting Under Size Heterogeneity and Dense Scenarios
title_sort emspan efficient multi scale pyramid attention network for object counting under size heterogeneity and dense scenarios
topic Computer vision
density estimation
size-adjustable Gaussian kernel
deep learning
convolutional neural networks
attention mechanisms
url https://ieeexplore.ieee.org/document/10851276/
work_keys_str_mv AT phunguyenphanhai emspanefficientmultiscalepyramidattentionnetworkforobjectcountingundersizeheterogeneityanddensescenarios
AT baobuiquoc emspanefficientmultiscalepyramidattentionnetworkforobjectcountingundersizeheterogeneityanddensescenarios
AT tranghoang emspanefficientmultiscalepyramidattentionnetworkforobjectcountingundersizeheterogeneityanddensescenarios