Prior-free 3D human pose estimation in a video using limb-vectors

Estimating accurate 3D human poses from a monocular video is fundamental to various computer vision tasks. Existing methods exploit 2D-to-3D pose lifting, multiview images, and depth sensors to model spatio-temporal dependencies. However, depth ambiguities, occlusions, and larger temporal receptive...

Full description

Saved in:
Bibliographic Details
Main Authors: Anam Memon, Qasim Arain, Nasrullah Pirzada, Akram Shaikh, Adel Sulaiman, Mana Saleh Al Reshan, Hani Alshahrani, Asadullah Shaikh
Format: Article
Language:English
Published: Elsevier 2024-12-01
Series:ICT Express
Subjects:
Online Access:http://www.sciencedirect.com/science/article/pii/S2405959524001188
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Estimating accurate 3D human poses from a monocular video is fundamental to various computer vision tasks. Existing methods exploit 2D-to-3D pose lifting, multiview images, and depth sensors to model spatio-temporal dependencies. However, depth ambiguities, occlusions, and larger temporal receptive fields pose challenges to these approaches. To address this, we propose a novel prior-free DCNN-based 3D human pose estimation method for monocular image sequences using limb vectors. Our method comprises two subnetworks: a limb direction estimator and a limb length estimator. The limb direction estimator utilizes a fully convolutional network to model limb direction vectors across a temporal window. We show that network complexity can be significantly reduced by utilizing dilated convolutional operations and a relatively smaller receptive field while maintaining estimation accuracy. Moreover, the limb length estimator captures stable limb length estimations from a reliable frame set. Our model has shown superior performance compared to existing methods on the Human3.6M and MPI-INF-3DHP datasets.
ISSN:2405-9595