Deep Ensemble Learning for Human Action Recognition in Still Images

Numerous human actions such as “Phoning,” “PlayingGuitar,” and “RidingHorse” can be inferred from static cues, even when video of the motion is available, because a single still image may already sufficiently explain a particular action. In this research, we investigate human act...


Bibliographic Details
Main Authors:	Xiangchun Yu, Zhe Zhang, Lei Wu, Wei Pang, Hechang Chen, Zhezhou Yu, Bin Li
Format:	Article
Language:	English
Published:	Wiley 2020-01-01
Series:	Complexity
Online Access:	http://dx.doi.org/10.1155/2020/9428612

author	Xiangchun Yu, Zhe Zhang, Lei Wu, Wei Pang, Hechang Chen, Zhezhou Yu, Bin Li
author_sort	Xiangchun Yu
collection	DOAJ
description	Numerous human actions such as “Phoning,” “PlayingGuitar,” and “RidingHorse” can be inferred from static cues, even when video of the motion is available, because a single still image may already sufficiently explain a particular action. In this research, we investigate human action recognition in still images and use deep ensemble learning to automatically decompose the body pose and perceive its background information. First, we construct an end-to-end NCNN-based model by attaching the nonsequential convolutional neural network (NCNN) module to the top of a pretrained model. The nonsequential topology of NCNN learns spatial and channel-wise features separately in parallel branches, which helps improve model performance. Next, to further exploit the advantage of the nonsequential topology, we propose an end-to-end deep ensemble learning based on weight optimization (DELWO) model, which learns from the data how to fuse the deep information derived from multiple models. Finally, we design the deep ensemble learning based on voting strategy (DELVS) model, which pools multiple deep models with weighted coefficients to obtain a better prediction. More importantly, reducing the number of trainable parameters lowers model complexity, which mitigates overfitting on small datasets to some extent. We conduct experiments on Li’s action dataset and on the uncropped and 1.5x cropped Willow action datasets, and the results validate the effectiveness and robustness of the proposed models in mitigating overfitting on small datasets. Our code is open source on GitHub (https://github.com/yxchspring/deep_ensemble_learning) so that the community can use the model.
format	Article
id	doaj-art-dd39402d644b41e198256d1c9c7005e3
institution	Kabale University
issn	1076-2787 1099-0526
language	English
publishDate	2020-01-01
publisher	Wiley
record_format	Article
series	Complexity
spelling	doaj-art-dd39402d644b41e198256d1c9c7005e3; 2025-02-03T01:04:15Z; eng; Wiley; Complexity; ISSN 1076-2787, 1099-0526; 2020-01-01; volume 2020; DOI 10.1155/2020/9428612; article ID 9428612; Deep Ensemble Learning for Human Action Recognition in Still Images. Author affiliations: Xiangchun Yu (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China); Zhe Zhang (College of Computer Science and Technology, Jilin University, Changchun 130012, China); Lei Wu (College of Computer Science and Technology, Jilin University, Changchun 130012, China); Wei Pang (School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK); Hechang Chen (College of Computer Science and Technology, Jilin University, Changchun 130012, China); Zhezhou Yu (College of Computer Science and Technology, Jilin University, Changchun 130012, China); Bin Li (School of Information Engineering, Northeast Electric Power University, Jilin 132012, China). http://dx.doi.org/10.1155/2020/9428612
title	Deep Ensemble Learning for Human Action Recognition in Still Images
url	http://dx.doi.org/10.1155/2020/9428612

Deep Ensemble Learning for Human Action Recognition in Still Images
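
The description says the DELVS model pools multiple deep models with weighted coefficients. As a rough illustration only, the sketch below shows generic weighted soft voting over per-model class probabilities. It is not taken from the authors' repository: the function name `weighted_vote`, the array shapes, and the example weights are all assumptions made for this example, and the paper's actual voting rule may differ.

```python
import numpy as np


def weighted_vote(probabilities, weights):
    """Generic weighted soft voting (illustrative, not the authors' code).

    probabilities: shape (n_models, n_samples, n_classes), each model's
                   predicted class probabilities.
    weights:       shape (n_models,), non-negative coefficients, not all zero.
    Returns the predicted class index per sample, shape (n_samples,).
    """
    probs = np.asarray(probabilities, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the coefficients sum to 1
    # Contract the model axis: weighted average of the probabilities.
    fused = np.tensordot(w, probs, axes=1)  # shape (n_samples, n_classes)
    return fused.argmax(axis=1)


# Hypothetical outputs of three models for two images and three classes.
example_probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]],
    [[0.3, 0.6, 0.1], [0.2, 0.2, 0.6]],
]
example_weights = [0.5, 0.25, 0.25]  # assumed coefficients
predictions = weighted_vote(example_probs, example_weights)
```

With these assumed numbers, the first model alone would pick class 0 for the first image, but the weighted average favors class 1, which is the kind of correction an ensemble is meant to provide.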

Similar Items