Image depth estimation assisted by multi-view projection

Abstract In recent years, deep learning has significantly advanced the development of image depth estimation algorithms. The depth estimation network with single-view input can only extract features from a single 2D image, often neglecting the information contained in neighboring views, resulting in...

Bibliographic Details
Main Authors:	Liman Liu, Jinshan Tian, Guansheng Luo, Siyuan Xu, Chen Zhang, Huaifei Hu, Wenbing Tao
Format:	Article
Language:	English
Published:	Springer 2024-12-01
Series:	Complex & Intelligent Systems
Subjects:	Multi-view projection Depth estimation Neural network Optical flow
Online Access:	https://doi.org/10.1007/s40747-024-01688-6

_version_	1832571198731452416
author	Liman Liu Jinshan Tian Guansheng Luo Siyuan Xu Chen Zhang Huaifei Hu Wenbing Tao
collection	DOAJ
description	Abstract In recent years, deep learning has significantly advanced image depth estimation algorithms. A depth estimation network with single-view input can only extract features from a single 2D image and often neglects the information contained in neighboring views. The learned features therefore lack real geometric information about the 3D world and strict constraints on the 3D structure, which limits depth estimation performance. In the absence of accurate camera information, the multi-view geometric cues obtained by some methods may not accurately reflect the real 3D structure, leaving image depth estimation algorithms without multi-view geometric constraints. To address this problem, a multi-view projection-assisted image depth estimation network is proposed, which integrates multi-view stereo vision into a deep learning-based encoder-decoder image depth estimation framework without pre-estimation of view bitmap. The network estimates optical flow for pixel-level matching across views and uses it to project the features of neighboring views to the reference viewpoint for self-attentive feature aggregation, compensating for the lack of stereo geometry information in the image depth estimation framework. A multi-view reprojection error is designed to supervise and effectively constrain the optical flow estimation. In addition, a long-distance attention decoding module is proposed to extract and aggregate features in distant areas of the scene, which enhances perception of long-distance outdoor regions. Experimental results on the KITTI, vKITTI, and SeasonDepth datasets demonstrate that the method achieves significant improvements over other state-of-the-art depth estimation techniques.
format	Article
id	doaj-art-ceef544e93cc4e2abc2511f551cd7c90
institution	Kabale University
issn	2199-4536 2198-6053
language	English
publishDate	2024-12-01
publisher	Springer
record_format	Article
series	Complex & Intelligent Systems
spelling	doaj-art-ceef544e93cc4e2abc2511f551cd7c90; 2025-02-02T12:48:49Z; eng; Springer; Complex & Intelligent Systems; 2199-4536; 2198-6053; 2024-12-01; 11; 1; 1; 16; 10.1007/s40747-024-01688-6; Image depth estimation assisted by multi-view projection; Liman Liu (0), Jinshan Tian (1), Guansheng Luo (2), Siyuan Xu (3), Chen Zhang (4), Huaifei Hu (5), Wenbing Tao (6); (0, 1, 2, 5) Key Laboratory of Cognitive Science, State Ethnic Affairs Commission, Hubei Provincial Key Laboratory of Medical Information Analysis and Tumor Diagnosis and Treatment, School of Biomedical Engineering, South-Central Minzu University; (3, 4, 6) National Key Laboratory of Science and Technology on Multi-spectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology; https://doi.org/10.1007/s40747-024-01688-6; Multi-view projection; Depth estimation; Neural network; Optical flow
title	Image depth estimation assisted by multi-view projection
topic	Multi-view projection Depth estimation Neural network Optical flow
url	https://doi.org/10.1007/s40747-024-01688-6
work_keys_str_mv	AT limanliu imagedepthestimationassistedbymultiviewprojection AT jinshantian imagedepthestimationassistedbymultiviewprojection AT guanshengluo imagedepthestimationassistedbymultiviewprojection AT siyuanxu imagedepthestimationassistedbymultiviewprojection AT chenzhang imagedepthestimationassistedbymultiviewprojection AT huaifeihu imagedepthestimationassistedbymultiviewprojection AT wenbingtao imagedepthestimationassistedbymultiviewprojection
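
The description in this record says the network uses optical flow to project neighboring-view features to the reference viewpoint. The following is a minimal sketch of that kind of flow-based warping, not the authors' implementation: the function name `warp_features_by_flow`, the `(C, H, W)` array layout, the reference-to-neighbor flow convention, and border clamping are all assumptions made for illustration.

```python
import numpy as np


def warp_features_by_flow(neighbor_feat, flow):
    """Bilinearly sample neighbor-view features at flow-displaced positions.

    neighbor_feat: (C, H, W) feature map of the neighboring view (assumed layout).
    flow: (2, H, W) reference-to-neighbor flow; channel 0 is dx, channel 1 is dy.
    Returns a (C, H, W) feature map aligned with the reference view.
    """
    _, h, w = neighbor_feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Matching coordinates in the neighbor view, clamped to the image border.
    x = np.clip(xs + flow[0], 0, w - 1)
    y = np.clip(ys + flow[1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets used as bilinear interpolation weights.
    wx = x - x0
    wy = y - y0

    top = neighbor_feat[:, y0, x0] * (1 - wx) + neighbor_feat[:, y0, x1] * wx
    bottom = neighbor_feat[:, y1, x0] * (1 - wx) + neighbor_feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy
```

With a zero flow field the function returns the neighbor features unchanged. In a full pipeline the warped features would then be fused with the reference-view features, for example by the self-attentive aggregation the description mentions.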
