TY - GEN
T1 - RSTAN
T2 - 27th International Conference on Pattern Recognition, ICPR 2024
AU - Jiang, Yaru
AU - Lyu, Shujing
AU - Zhan, Hongjian
AU - Lu, Yue
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Human falls pose a significant threat to health, especially among the elderly. Unlike standard action recognition, falls combine static and dynamic attributes. They are highly sensitive to spatio-temporal motion and are marked by sudden, transient occurrences. This paper proposes a novel spatio-temporal convolutional method for end-to-end human fall detection, named Residual Spatio-Temporal Attention Network (RSTAN). The network integrates a Spatial Channel Attention (SCA) module within the convolutional layers of the Residual 3D convolution to enhance feature refinement; the module selectively accentuates spatial and channel dimensions. In addition, to capture both the extensive spatio-temporal features and the short-range spatio-temporal characteristics of human falls, and thereby distinguish them effectively from daily activities, we propose a Multi-interval Difference Aggregation (MDA) method. The MDA uses frame differences at multiple time intervals to extract motion features. Experiments on three publicly available fall detection datasets demonstrate the superior performance of the proposed method, which achieves 100% accuracy on the UR Fall Detection dataset.
AB - Human falls pose a significant threat to health, especially among the elderly. Unlike standard action recognition, falls combine static and dynamic attributes. They are highly sensitive to spatio-temporal motion and are marked by sudden, transient occurrences. This paper proposes a novel spatio-temporal convolutional method for end-to-end human fall detection, named Residual Spatio-Temporal Attention Network (RSTAN). The network integrates a Spatial Channel Attention (SCA) module within the convolutional layers of the Residual 3D convolution to enhance feature refinement; the module selectively accentuates spatial and channel dimensions. In addition, to capture both the extensive spatio-temporal features and the short-range spatio-temporal characteristics of human falls, and thereby distinguish them effectively from daily activities, we propose a Multi-interval Difference Aggregation (MDA) method. The MDA uses frame differences at multiple time intervals to extract motion features. Experiments on three publicly available fall detection datasets demonstrate the superior performance of the proposed method, which achieves 100% accuracy on the UR Fall Detection dataset.
KW - Human fall detection
KW - Multi-interval difference aggregation
KW - Residual 3D convolution
KW - Spatial channel attention
UR - https://www.scopus.com/pages/publications/85212521172
U2 - 10.1007/978-3-031-78354-8_23
DO - 10.1007/978-3-031-78354-8_23
M3 - Conference contribution
AN - SCOPUS:85212521172
SN - 9783031783531
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 360
EP - 374
BT - Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
A2 - Antonacopoulos, Apostolos
A2 - Chaudhuri, Subhasis
A2 - Chellappa, Rama
A2 - Liu, Cheng-Lin
A2 - Bhattacharya, Saumik
A2 - Pal, Umapada
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 1 December 2024 through 5 December 2024
ER -