Fall detection method based on spatio-temporal coordinate attention for high-resolution networks
Abstract Fall behavior is closely related to the high mortality rate of the elderly, so fall detection has become an important and urgent research area in human behavior recognition. However, existing fall detection methods suffer from the loss of detailed action information during feature extr...
Main Authors: | Xiaorui Zhang, Qijian Xie, Wei Sun, Ting Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Springer 2024-11-01 |
Series: | Complex & Intelligent Systems |
Subjects: | Fall detection; Similar behaviors; Pseudo-3D basic block; Spatio-temporal coordinate attention |
Online Access: | https://doi.org/10.1007/s40747-024-01660-4 |
_version_ | 1832571166421680128 |
---|---|
author | Xiaorui Zhang Qijian Xie Wei Sun Ting Wang |
author_facet | Xiaorui Zhang Qijian Xie Wei Sun Ting Wang |
author_sort | Xiaorui Zhang |
collection | DOAJ |
description | Abstract Fall behavior is closely related to the high mortality rate of the elderly, so fall detection has become an important and urgent research area in human behavior recognition. However, existing fall detection methods suffer from the loss of detailed action information during feature extraction due to the downsampling operation, resulting in subpar performance when distinguishing falls from similar behaviors such as lying and sitting. To address these challenges, this study proposes a high-resolution spatio-temporal feature extraction method based on a spatio-temporal coordinate attention mechanism. The method employs 3D convolutions to extract spatio-temporal features and uses gradual down-sampling to generate a multi-resolution sub-network, enabling multi-scale fusion and enhanced perception of details. In particular, this study designs a pseudo-3D basic block, which simulates the ability of 3D convolution, to maintain the running speed of the network while controlling the number of parameters. Further, a spatio-temporal coordinate attention mechanism is designed to accurately extract the spatio-temporal positional changes of key skeletal points and the interrelationships among them. Long-term dependencies in the horizontal, vertical, and temporal directions are captured through three one-dimensional global pooling operations. The long-range relationships and channel correlations among features are then captured by cascading and slicing operations. Finally, the key information is highlighted by performing dot-multiplication operations between the feature maps from the horizontal, vertical, and temporal directions and the input feature maps. Experimental results on three typical public datasets show that the proposed method can better extract motion features and improve the accuracy of fall detection. |
format | Article |
id | doaj-art-5651f9e4c6194d8eafc210554cee29d8 |
institution | Kabale University |
issn | 2199-4536 2198-6053 |
language | English |
publishDate | 2024-11-01 |
publisher | Springer |
record_format | Article |
series | Complex & Intelligent Systems |
spelling | doaj-art-5651f9e4c6194d8eafc210554cee29d8; 2025-02-02T12:49:04Z; eng; Springer; Complex & Intelligent Systems; 2199-4536; 2198-6053; 2024-11-01; 111113; 10.1007/s40747-024-01660-4; Fall detection method based on spatio-temporal coordinate attention for high-resolution networks; Xiaorui Zhang (0); Qijian Xie (1); Wei Sun (2); Ting Wang (3); (0) College of Computer and Information Engineering, Nanjing Tech University; (1) School of Computer Science, Nanjing University of Information Science & Technology; (2) School of Automation, Nanjing University of Information Science & Technology; (3) College of Electrical Engineering and Control Science, Nanjing Tech University; Abstract Fall behavior is closely related to the high mortality rate of the elderly, so fall detection has become an important and urgent research area in human behavior recognition. However, existing fall detection methods suffer from the loss of detailed action information during feature extraction due to the downsampling operation, resulting in subpar performance when distinguishing falls from similar behaviors such as lying and sitting. To address these challenges, this study proposes a high-resolution spatio-temporal feature extraction method based on a spatio-temporal coordinate attention mechanism. The method employs 3D convolutions to extract spatio-temporal features and uses gradual down-sampling to generate a multi-resolution sub-network, enabling multi-scale fusion and enhanced perception of details. In particular, this study designs a pseudo-3D basic block, which simulates the ability of 3D convolution, to maintain the running speed of the network while controlling the number of parameters. Further, a spatio-temporal coordinate attention mechanism is designed to accurately extract the spatio-temporal positional changes of key skeletal points and the interrelationships among them. Long-term dependencies in the horizontal, vertical, and temporal directions are captured through three one-dimensional global pooling operations. The long-range relationships and channel correlations among features are then captured by cascading and slicing operations. Finally, the key information is highlighted by performing dot-multiplication operations between the feature maps from the horizontal, vertical, and temporal directions and the input feature maps. Experimental results on three typical public datasets show that the proposed method can better extract motion features and improve the accuracy of fall detection.; https://doi.org/10.1007/s40747-024-01660-4; Fall detection; Similar behaviors; Pseudo-3D basic block; Spatio-temporal coordinate attention |
spellingShingle | Xiaorui Zhang Qijian Xie Wei Sun Ting Wang Fall detection method based on spatio-temporal coordinate attention for high-resolution networks Complex & Intelligent Systems Fall detection Similar behaviors Pseudo-3D basic block Spatio-temporal coordinate attention |
title | Fall detection method based on spatio-temporal coordinate attention for high-resolution networks |
title_full | Fall detection method based on spatio-temporal coordinate attention for high-resolution networks |
title_fullStr | Fall detection method based on spatio-temporal coordinate attention for high-resolution networks |
title_full_unstemmed | Fall detection method based on spatio-temporal coordinate attention for high-resolution networks |
title_short | Fall detection method based on spatio-temporal coordinate attention for high-resolution networks |
title_sort | fall detection method based on spatio temporal coordinate attention for high resolution networks |
topic | Fall detection Similar behaviors Pseudo-3D basic block Spatio-temporal coordinate attention |
url | https://doi.org/10.1007/s40747-024-01660-4 |
work_keys_str_mv | AT xiaoruizhang falldetectionmethodbasedonspatiotemporalcoordinateattentionforhighresolutionnetworks AT qijianxie falldetectionmethodbasedonspatiotemporalcoordinateattentionforhighresolutionnetworks AT weisun falldetectionmethodbasedonspatiotemporalcoordinateattentionforhighresolutionnetworks AT tingwang falldetectionmethodbasedonspatiotemporalcoordinateattentionforhighresolutionnetworks |
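
The attention steps summarized in the description field (three one-dimensional global poolings, a cascading step, a slicing step, and a final multiplication with the input) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `st_coordinate_attention`, the (C, T, H, W) tensor layout, average pooling, the ReLU, the channel-reduction size, and the random weights are all assumptions made for illustration, loosely following the general shape of 2D coordinate attention extended with a temporal axis.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def st_coordinate_attention(x, w_shared, w_h, w_w, w_t):
    """Hypothetical sketch. x has shape (C, T, H, W).

    w_shared: (Cr, C) channel-mixing matrix applied to the concatenated
    descriptors; w_h, w_w, w_t: (C, Cr) per-direction matrices. These play
    the role of 1x1 convolutions and are assumptions, not the paper's layers.
    """
    c, t, h, w = x.shape
    # Three one-dimensional global poolings: keep one axis, average the other two.
    pooled_h = x.mean(axis=(1, 3))  # (C, H), vertical direction
    pooled_w = x.mean(axis=(1, 2))  # (C, W), horizontal direction
    pooled_t = x.mean(axis=(2, 3))  # (C, T), temporal direction
    # "Cascading": concatenate the three descriptors along the position axis.
    stacked = np.concatenate([pooled_h, pooled_w, pooled_t], axis=1)  # (C, H+W+T)
    # Shared channel mixing followed by ReLU (assumed nonlinearity).
    mixed = np.maximum(w_shared @ stacked, 0.0)  # (Cr, H+W+T)
    # "Slicing": split back into the three directional parts.
    f_h, f_w, f_t = np.split(mixed, [h, h + w], axis=1)
    # Per-direction attention weights in (0, 1).
    a_h = sigmoid(w_h @ f_h)  # (C, H)
    a_w = sigmoid(w_w @ f_w)  # (C, W)
    a_t = sigmoid(w_t @ f_t)  # (C, T)
    # Element-wise multiplication of the input with all three attention maps.
    return (x
            * a_t[:, :, None, None]
            * a_h[:, None, :, None]
            * a_w[:, None, None, :])


# Small demo with arbitrary sizes and random weights (illustrative only).
rng = np.random.default_rng(0)
c, t, h, w = 8, 4, 6, 5
cr = 2  # assumed reduced channel count
x = rng.standard_normal((c, t, h, w))
w_shared = rng.standard_normal((cr, c))
w_h = rng.standard_normal((c, cr))
w_w = rng.standard_normal((c, cr))
w_t = rng.standard_normal((c, cr))
y = st_coordinate_attention(x, w_shared, w_h, w_w, w_t)
print(y.shape)
```

Because each attention weight lies between 0 and 1, the output keeps the input's shape and never exceeds it in magnitude; positions that score highly along all three directions are attenuated least, which is one way to read "highlighting key information" in the description.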