Fall detection method based on spatio-temporal coordinate attention for high-resolution networks

Abstract Fall behavior is closely related to the high mortality rate of the elderly, so fall detection has become an important and urgent research area in human behavior recognition. However, the existing fall detection methods suffer from the loss of detailed action information during feature extr...

Bibliographic Details
Main Authors: Xiaorui Zhang, Qijian Xie, Wei Sun, Ting Wang
Format: Article
Language: English
Published: Springer 2024-11-01
Series: Complex & Intelligent Systems
Online Access: https://doi.org/10.1007/s40747-024-01660-4
collection DOAJ
description Abstract Fall behavior is closely related to the high mortality rate of the elderly, so fall detection has become an important and urgent research area in human behavior recognition. However, existing fall detection methods lose detailed action information during feature extraction because of the downsampling operation, which results in subpar performance when distinguishing falls from similar behaviors such as lying and sitting. To address these challenges, this study proposes a high-resolution spatio-temporal feature extraction method based on a spatio-temporal coordinate attention mechanism. The method employs 3D convolutions to extract spatio-temporal features and uses gradual down-sampling to generate a multi-resolution sub-network, thus realizing multi-scale fusion and enhanced perception of details. In particular, this study designs a pseudo-3D basic block, which simulates the ability of 3D convolution, to maintain the running speed of the network while controlling the number of parameters. Further, a spatio-temporal coordinate attention mechanism is designed to accurately extract the spatio-temporal positional changes of key skeletal points and the interrelationships among them. Long-term dependencies in the horizontal, vertical, and temporal directions are captured through three one-dimensional global pooling operations. Then the long-range relationships and channel correlations among features are captured by cascading and slicing operations. Finally, the key information is highlighted by performing dot-multiplication operations between the feature maps from the horizontal, vertical, and temporal directions and the input feature maps. Experimental results on three typical public datasets show that the proposed method can better extract motion features and improve the accuracy of fall detection.
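The attention steps named in the abstract (three one-dimensional global poolings, a cascading step, a slicing step, and a final multiplication with the input) can be sketched in NumPy as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function name `st_coordinate_attention`, the weight matrices `w_reduce`, `w_t`, `w_h`, `w_w`, the ReLU and sigmoid activations, and the use of average pooling are all assumptions, and the record gives no layer sizes or normalization details.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def st_coordinate_attention(x, w_reduce, w_t, w_h, w_w):
    """Hypothetical spatio-temporal coordinate attention.

    x        : feature map of shape (C, T, H, W)
    w_reduce : (C_mid, C) shared channel-mixing weights (assumed)
    w_t/h/w  : (C, C_mid) per-direction weights (assumed)
    """
    c, t, h, w = x.shape

    # Three one-dimensional global poolings: each keeps one axis
    # (temporal, vertical, horizontal) and averages over the other two.
    pool_t = x.mean(axis=(2, 3))  # (C, T)
    pool_h = x.mean(axis=(1, 3))  # (C, H)
    pool_w = x.mean(axis=(1, 2))  # (C, W)

    # "Cascading": concatenate the three descriptors along the position
    # axis and mix channels with one shared transform followed by ReLU.
    y = np.concatenate([pool_t, pool_h, pool_w], axis=1)  # (C, T+H+W)
    y = np.maximum(w_reduce @ y, 0.0)                     # (C_mid, T+H+W)

    # "Slicing": split the mixed descriptor back into the three directions.
    y_t, y_h, y_w = np.split(y, [t, t + h], axis=1)

    # One attention map per direction, each with values in (0, 1).
    a_t = sigmoid(w_t @ y_t)  # (C, T)
    a_h = sigmoid(w_h @ y_h)  # (C, H)
    a_w = sigmoid(w_w @ y_w)  # (C, W)

    # Element-wise multiplication with the input, broadcasting each map
    # along the two axes it does not cover.
    return (x
            * a_t[:, :, None, None]
            * a_h[:, None, :, None]
            * a_w[:, None, None, :])


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 6, 5))       # C=8, T=4, H=6, W=5
w_reduce = rng.standard_normal((2, 8))      # C_mid=2 is an arbitrary choice
w_t, w_h, w_w = (rng.standard_normal((8, 2)) for _ in range(3))
out = st_coordinate_attention(x, w_reduce, w_t, w_h, w_w)
print(out.shape)  # (8, 4, 6, 5)
```

Because every attention value lies strictly between 0 and 1, the output keeps the shape of the input and only rescales it, so positions that score highly along all three directions are attenuated the least.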
id doaj-art-5651f9e4c6194d8eafc210554cee29d8
institution Kabale University
issn 2199-4536
2198-6053
affiliation Xiaorui Zhang: College of Computer and Information Engineering, Nanjing Tech University
Qijian Xie: School of Computer Science, Nanjing University of Information Science & Technology
Wei Sun: School of Automation, Nanjing University of Information Science & Technology
Ting Wang: College of Electrical Engineering and Control Science, Nanjing Tech University
topic Fall detection
Similar behaviors
Pseudo-3D basic block
Spatio-temporal coordinate attention