Multimodal Multiobject Tracking by Fusing Deep Appearance Features and Motion Information

Multiobject Tracking (MOT) is one of the most important capabilities of an autonomous driving system. However, most existing MOT methods rely on a single sensor, such as a camera, which limits their reliability. In this paper, we propose a novel multiobject tracking method that fuses the deep appearance features and motion information of objects. In this method, object locations are first determined by a 2D object detector and a 3D object detector. A Nonmaximum Suppression (NMS) algorithm combines the detection results of the two detectors to ensure detection accuracy in complex scenes. A Convolutional Neural Network (CNN) then learns the deep appearance features of the objects, while a Kalman Filter provides their motion information. Finally, the MOT task is performed by associating the motion information with the deep appearance features; a successful match indicates that the object was tracked successfully. Experiments on the KITTI Tracking Benchmark show that the proposed method performs the MOT task effectively, achieving a Multiobject Tracking Accuracy (MOTA) of 76.40% and a Multiobject Tracking Precision (MOTP) of 83.50%.

Bibliographic Details
Main Authors: Liwei Zhang, Jiahong Lai, Zenghui Zhang, Zhen Deng, Bingwei He, Yucheng He
Author Affiliations: School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou, China (Liwei Zhang, Jiahong Lai, Zenghui Zhang, Zhen Deng, Bingwei He); The T Stone Robotics Institute, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Hong Kong, China (Yucheng He)
Format: Article
Language: English
Published: Wiley, 2020-01-01
Series: Complexity
ISSN: 1076-2787, 1099-0526
Online Access: http://dx.doi.org/10.1155/2020/8810340
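The abstract describes an association step that combines CNN appearance embeddings with Kalman-filter motion estimates. The sketch below is only an illustration of such a combined appearance/motion cost, not code from the paper: the cost weighting, chi-square gate, and use of the Hungarian algorithm are all assumptions.

```python
# Hypothetical sketch of appearance + motion data association for MOT.
# The cost weighting, chi-square gate, and Hungarian assignment are
# illustrative assumptions, not details taken from the paper.
import numpy as np
from scipy.optimize import linear_sum_assignment


def cosine_cost(track_feats, det_feats):
    """Pairwise cosine distance between L2-normalised appearance embeddings."""
    t = track_feats / np.linalg.norm(track_feats, axis=1, keepdims=True)
    d = det_feats / np.linalg.norm(det_feats, axis=1, keepdims=True)
    return 1.0 - t @ d.T


def mahalanobis_cost(track_means, track_covs, det_states):
    """Pairwise squared Mahalanobis distance between Kalman predictions and detections."""
    cost = np.zeros((len(track_means), len(det_states)))
    for i, (mean, cov) in enumerate(zip(track_means, track_covs)):
        diff = det_states - mean              # (num_dets, state_dim)
        inv_cov = np.linalg.inv(cov)
        cost[i] = np.einsum("md,dk,mk->m", diff, inv_cov, diff)
    return cost


def associate(track_feats, det_feats, track_means, track_covs, det_states,
              w_app=0.7, gate=9.49):
    """Blend appearance and motion costs, gate implausible pairs, and
    solve a one-to-one assignment with the Hungarian algorithm."""
    app = cosine_cost(track_feats, det_feats)
    motion = mahalanobis_cost(track_means, track_covs, det_states)
    cost = w_app * app + (1.0 - w_app) * motion
    cost[motion > gate] = 1e6                  # gate: assumed chi-square threshold
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < 1e6]
```

In a practical tracker the two distances would typically be normalised or used as complementary gates rather than summed directly, since cosine and Mahalanobis costs live on different scales; the simple weighted sum here is only meant to show how appearance and motion cues can feed one assignment problem.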