TY - GEN
T1 - Spatio-Temporal Deviation Calibration for Skeleton-Based Human Action Recognition
AU - Freixas, Gerard Marcos
AU - Feng, Zunlei
AU - Han, Kelvin Ting Zuo
AU - Jin, Cheng
AU - Hu, Jiacong
AU - Lei, Jie
AU - Wu, Xingjiao
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
AB - Human Action Recognition (HAR) plays an important role in applications such as video surveillance, human-computer interaction, and healthcare. In recent years, there has been increasing interest in skeleton-based representations for HAR because of their robustness to changes in appearance and viewpoint. However, skeleton-based approaches suffer from the class similarity problem: high intraclass variability and low interclass separability, both of which stem from the inherent structure of the human skeleton. Graph Convolutional Networks (GCNs) have achieved remarkable results by modeling the spatio-temporal relationships of a skeleton sequence. Motivated by the class similarity problem and the lack of a GCN optimization tool in the skeleton-based HAR context, we propose a simple gradient-based re-training pipeline, applicable to any state-of-the-art GCN model, that constrains low-confidence predictions and guides the model towards the ground-truth categories. By focusing on the relevant frames and nodes of each category, the pipeline ignores certain incorrect patterns yielded by these low-confidence samples, leading to a notable improvement of the model. Experimental results demonstrate the flexibility and effectiveness of the proposed method, which improves the accuracy of mainstream GCN recognizers by up to 3%.
KW - class similarity problem
KW - graph convolutional network
KW - skeleton-based human action recognition
UR - https://www.scopus.com/pages/publications/105009403683
U2 - 10.1007/978-981-96-6963-9_13
DO - 10.1007/978-981-96-6963-9_13
M3 - Conference contribution
AN - SCOPUS:105009403683
SN - 9789819669622
T3 - Communications in Computer and Information Science
SP - 180
EP - 194
BT - Neural Information Processing - 31st International Conference, ICONIP 2024, Proceedings
A2 - Mahmud, Mufti
A2 - Doborjeh, Maryam
A2 - Wong, Kevin
A2 - Leung, Andrew Chi Sing
A2 - Doborjeh, Zohreh
A2 - Tanveer, M.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 31st International Conference on Neural Information Processing, ICONIP 2024
Y2 - 2 December 2024 through 6 December 2024
ER -