3D Semantic VSLAM of Indoor Environment Based on Mask Scoring RCNN

Existing visual SLAM (VSLAM) algorithms suffer from low accuracy and low label classification accuracy when constructing semantic maps of indoor environments in which feature points are sparse. This paper proposes a 3D semantic VSLAM algorithm called BMASK-RCNN based on Mask Scorin...

Bibliographic Details
Main Authors: Chongben Tao, Yufeng Jin, Feng Cao, Zufeng Zhang, Chunguang Li, Hanwen Gao
Format: Article
Language: English
Published: Wiley 2020-01-01
Series: Discrete Dynamics in Nature and Society
Online Access: http://dx.doi.org/10.1155/2020/5916205
author Chongben Tao
Yufeng Jin
Feng Cao
Zufeng Zhang
Chunguang Li
Hanwen Gao
collection DOAJ
description Existing visual SLAM (VSLAM) algorithms suffer from low accuracy and low label classification accuracy when constructing semantic maps of indoor environments in which feature points are sparse. This paper proposes a 3D semantic VSLAM algorithm called BMASK-RCNN, based on Mask Scoring RCNN. First, image feature points are extracted with the Binary Robust Invariant Scalable Keypoints (BRISK) algorithm. Second, map points of the reference key frame are projected onto the current frame for feature matching and pose estimation, and an inverse depth filter estimates the scene depth of each newly created key frame to obtain camera pose changes. To achieve object detection and semantic segmentation for both static and dynamic objects in indoor environments, and then to construct a dense 3D semantic map with the VSLAM algorithm, the structure of Mask Scoring RCNN is partially adjusted and the TUM RGB-D SLAM dataset is used for transfer learning. Semantic information about independent targets in the scene, including their categories, not only supports high-accuracy localization but also enables a probability update of the semantic estimate by marking movable objects, thereby reducing the impact of moving objects on real-time mapping. In simulations and real-world experiments comparing against three other algorithms, the proposed algorithm shows better robustness, and the semantic information used in 3D semantic mapping is obtained accurately.
format Article
id doaj-art-7f28c5b77d0c495bb90a03888a33691e
institution Kabale University
issn 1026-0226
1607-887X
language English
publishDate 2020-01-01
publisher Wiley
record_format Article
series Discrete Dynamics in Nature and Society
spelling doaj-art-7f28c5b77d0c495bb90a03888a33691e
2025-02-03T01:28:29Z
eng
Wiley
Discrete Dynamics in Nature and Society
1026-0226
1607-887X
2020-01-01
2020
10.1155/2020/5916205
5916205
3D Semantic VSLAM of Indoor Environment Based on Mask Scoring RCNN
Chongben Tao (0), Yufeng Jin (1), Feng Cao (2), Zufeng Zhang (3), Chunguang Li (4), Hanwen Gao (5)
(0) School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, Jiangsu, China
(1) School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, Jiangsu, China
(2) School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
(3) Department of Automation, Tsinghua University, Beijing 100084, China
(4) School of Computer Information and Engineering, Changzhou Institute of Technology, Changzhou 213002, Jiangsu, China
(5) School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, Jiangsu, China
http://dx.doi.org/10.1155/2020/5916205
title 3D Semantic VSLAM of Indoor Environment Based on Mask Scoring RCNN
url http://dx.doi.org/10.1155/2020/5916205