OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning

Human action recognition has become crucial in computer vision, with growing applications in surveillance, human–computer interaction, and healthcare. Traditional approaches often use broad feature representations, which may miss subtle variations in timing and movement within action sequences. Our...

Bibliographic Details
Main Authors:	Muhammad Usman, Wenming Cao, Zhao Huang, Jianqi Zhong, Ruiya Ji
Format:	Article
Language:	English
Published:	MDPI AG 2024-11-01
Series:	AI
Subjects:	skeleton-based action representation learning; unsupervised learning; hierarchical contrastive learning; one-to-many
Online Access:	https://www.mdpi.com/2673-2688/5/4/106

author	Muhammad Usman, Wenming Cao, Zhao Huang, Jianqi Zhong, Ruiya Ji
author_sort	Muhammad Usman
collection	DOAJ
description	Human action recognition has become crucial in computer vision, with growing applications in surveillance, human–computer interaction, and healthcare. Traditional approaches often use broad feature representations, which may miss subtle variations in timing and movement within action sequences. Our proposed One-to-Many Hierarchical Contrastive Learning (OTM-HC) framework maps the input into multi-layered feature vectors, creating a hierarchical contrastive representation that captures various granularities within the temporal and spatial domains of a human skeleton sequence. Using sequence-to-sequence (Seq2Seq) transformer encoders and downsampling modules, OTM-HC can distinguish between multiple levels of action representations, such as instance, domain, clip, and part levels. Each level contributes significantly to a comprehensive understanding of action representations. The OTM-HC model design is adaptable, ensuring smooth integration with advanced Seq2Seq encoders. We tested the OTM-HC framework across four datasets, demonstrating improved performance over state-of-the-art models. Specifically, OTM-HC achieved improvements of 0.9% and 0.6% on NTU60, 0.4% and 0.7% on NTU120, and 0.7% and 0.3% on PKU-MMD I and II, respectively, surpassing previous leading approaches across these datasets. These results showcase the robustness and adaptability of our model for various skeleton-based action recognition tasks.
format	Article
id	doaj-art-750123ff22af4e00a2b53140708a1992
institution	DOAJ
issn	2673-2688
language	English
publishDate	2024-11-01
publisher	MDPI AG
record_format	Article
series	AI
spelling	doaj-art-750123ff22af4e00a2b53140708a1992 | 2025-08-20T02:53:34Z | eng | MDPI AG | AI | 2673-2688 | 2024-11-01 | vol. 5, no. 4, pp. 2170-2186 | 10.3390/ai5040106 | OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning | Muhammad Usman (College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China); Wenming Cao (College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China); Zhao Huang (Department of Computer and Information Science, Northumbria University, Newcastle NE1 8ST, UK); Jianqi Zhong (College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China); Ruiya Ji (Department of Computer Science, Queen Mary University of London, London E1 4NS, UK) | https://www.mdpi.com/2673-2688/5/4/106 | skeleton-based action representation learning; unsupervised learning; hierarchical contrastive learning; one-to-many
title	OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning
topic	skeleton-based action representation learning; unsupervised learning; hierarchical contrastive learning; one-to-many
url	https://www.mdpi.com/2673-2688/5/4/106

Similar Items